## Cascaded Subgroups Discovery with an Application to Regression

Citations: 2 (0 self)

### BibTeX

```bibtex
@MISC{Grosskreutz_cascadedsubgroups,
  author = {Henrik Grosskreutz},
  title  = {Cascaded Subgroups Discovery with an Application to Regression},
  year   = {}
}
```

### Abstract

Subgroup discovery is a task from the area of Knowledge Discovery in Databases (KDD) that aims at finding interesting subgroups of a population. One problem with subgroup discovery algorithms is that many of them return a very high number of subgroups, including many redundant ones. In this paper, we present an approach to iteratively build up a set of subgroups for a numerical target attribute. The result is an additive representation of the patterns in the dataset, which can also be used as a regression model. The iterative scheme presented is similar to Transformation-Based Regression (TBR), an algorithm from the area of rule-based regression. While this is work in progress, first experiments show that the resulting sets of subgroups have a predictive accuracy that is similar to that of models generated by TBR, while the models are much more compact and arguably easier to interpret.
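The iterative scheme the abstract outlines can be sketched roughly as follows. This is a simplified illustration, not the authors' implementation: an exhaustive search over single attribute-value conditions stands in for full subgroup discovery, and residual deviations are scored with a simplified q = √n·|m| rather than the paper's √n(m − m0):

```python
import math

def best_subgroup(rows, residuals):
    """Stand-in for subgroup discovery: pick the single (attribute, value)
    condition whose covered examples have the largest residual deviation,
    scored q = sqrt(n) * |mean residual| (a simplification of the paper's
    sqrt(n) * (m - m0) quality function)."""
    best, best_q = None, 0.0
    for key in rows[0]:
        for val in {r[key] for r in rows}:
            covered = [residuals[i] for i, r in enumerate(rows) if r[key] == val]
            m = sum(covered) / len(covered)
            q = math.sqrt(len(covered)) * abs(m)
            if q > best_q:
                best_q, best = q, ((key, val), m)
    return best

def fit_cascade(rows, target, iterations=3):
    """Cascaded scheme: each discovered subgroup's mean residual becomes an
    additive shift, and its effect is masked (subtracted) before the next
    iteration searches the residuals again."""
    m0 = sum(target) / len(target)
    residuals = [t - m0 for t in target]
    model = []
    for _ in range(iterations):
        found = best_subgroup(rows, residuals)
        if found is None:  # no subgroup improves the model any further
            break
        (key, val), shift = found
        model.append(((key, val), shift))
        for i, r in enumerate(rows):
            if r[key] == val:
                residuals[i] -= shift
    return m0, model

def predict(m0, model, row):
    """Additive representation: population mean plus the shifts of all
    matching subgroup descriptions (order-independent)."""
    return m0 + sum(s for (k, v), s in model if row[k] == v)

# Toy usage (made-up data): three cheap "zone a" items, one expensive "zone b"
rows = [{"zone": "a"}, {"zone": "a"}, {"zone": "a"}, {"zone": "b"}]
target = [10.0, 12.0, 14.0, 30.0]
m0, model = fit_cascade(rows, target)
predict(m0, model, {"zone": "a"})  # 12.0
predict(m0, model, {"zone": "b"})  # 30.0
```

The resulting model is just a list of (description, shift) pairs, which is what makes the representation compact compared to a full rule ensemble.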

### Citations

4164 | Classification and regression trees
- Breiman, Friedman, et al.
- 1984

Citation Context: ...ditive and a multiplicative component. Other approaches to rule-based regression use pseudo-classes, which essentially corresponds to a discretization of the numerical target [WI95]. Regression trees [BFOS84] build up tree-structured prediction models that are, similar to rule-based learners, relatively simple to interpret. The idea to iteratively search for subgroups, masking the effects of the subgroups...

1204 | Mining frequent patterns without candidate generation
- Han, Pei, et al.
- 2000
Citation Context: ...butes number of examples autos 10 199 housing 13 506 servo 4 167 solar flare 10 1066 Table 3. Datasets We won’t go into the details of our implementation except to remark that it made use of FP-Trees [HPY00] and (relatively simple) optimistic estimates [Wro97] to speed up the computation of the subgroups (see also the paragraph on fast subgroup discovery in the related work section). More details can be f...

788 | Transformation-Based Error-Driven Learning and Natural Language Processing: A Case Study in Part-of-Speech Tagging
- Brill
- 1995

Citation Context: ... As already mentioned in the introduction, our algorithm is quite similar to Transformation-Based Regression (TBR) [BKN+02], a rule-based regression technique based on Transformation-Based Learning [Bri95]. TBR iteratively builds up a prediction model by refining the model by means of transformation rules. More specifically, in the i+1-th iteration a transformation rule takes as input the prediction o...

696 | UCI Machine Learning Repository
- Asuncion, Newman
- 2007

Citation Context: ... mechanisms for price calculation. As mentioned earlier, this is work in progress. We present first results obtained by applying our algorithm to four (slightly modified) datasets from the UCI Repository [AN07]. The experiments show that our simple additive model, based on a set of subgroups, is much more compact than the representation obtained using TBR while it has a similar predictive accuracy. We also ...

682 | The cascade-correlation learning architecture
- Fahlman, Lebiere
- 1990
Citation Context: ...p a model by iteratively refining the prediction made in earlier stages is also underlying the technique of Boosting [FS99]. This idea is also related to the Cascade-Correlation Learning Architecture [FL90], which iteratively builds up a neural network layer by layer, where each layer builds on the previous, unmodified layers. Of course, there is also a large body of work on regression respectively func...

611 | A short introduction to boosting
- Freund, Schapire
- 1999
Citation Context: ...and TBR) consider the task of numeric regression. The idea to incrementally build up a model by iteratively refining the prediction made in earlier stages is also underlying the technique of Boosting [FS99]. This idea is also related to the Cascade-Correlation Learning Architecture [FL90], which iteratively builds up a neural network layer by layer, where each layer builds on the previous, unmodified la...

380 | Learning decision lists
- Rivest
- 1987
Citation Context: ... question how to calculate a prediction from a set of overlapping subgroups can also be answered in other ways: One possibility is to interpret the sequence of subgroups as an (ordered) decision list [Riv87] and to only consider the first matching subgroup description. Our additive interpretation, where the order of the subgroup descriptions is irrelevant, has the advantage that it is similar to familiar...
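The contrast between the two interpretations mentioned in this context can be shown with a small sketch; the subgroup descriptions and shift values below are hypothetical, chosen only for illustration:

```python
# Two ways to turn an (overlapping) set of subgroups into a prediction.
# Descriptions and shifts are hypothetical, not taken from the paper.

def predict_decision_list(m0, model, row):
    """Ordered decision-list reading [Riv87]: only the first matching
    subgroup description determines the prediction."""
    for (key, val), shift in model:
        if row.get(key) == val:
            return m0 + shift
    return m0  # no subgroup matches: fall back to the population mean

def predict_additive(m0, model, row):
    """Additive reading (order irrelevant): sum the shifts of all
    matching subgroup descriptions."""
    return m0 + sum(s for (k, v), s in model if row.get(k) == v)

m0 = 100.0
model = [(("fuel", "diesel"), 15.0), (("doors", 2), -8.0)]
row = {"fuel": "diesel", "doors": 2}   # matches both subgroups

predict_decision_list(m0, model, row)  # 115.0 -- first match only
predict_additive(m0, model, row)       # 107.0 -- 100 + 15 - 8
```

Under the additive reading, reordering `model` cannot change any prediction, which is the property the context argues makes it easier to interpret.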

125 | Explora: a multipattern and multistrategy discovery assistant
- Klösgen
- 1996

Citation Context: ... − m0) where m and m0 denote the mean in the subgroup and in the overall population, respectively, while n denotes the size of the subgroup. This quality function is based on the statistical mean test [Klö96] and thus has a neat formal foundation. Now the problem of subgroup discovery is defined as follows: Given a database DB, the quality function q(DB, sd) := √n(m − m0), and a number k, determine the k s...
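The quality function quoted in this context translates directly into code; a minimal sketch, with made-up target values for illustration:

```python
import math

def quality(subgroup_values, population_values):
    """Mean-test quality q = sqrt(n) * (m - m0): n is the subgroup size,
    m the mean of the numeric target inside the subgroup, and m0 the
    mean over the whole population."""
    n = len(subgroup_values)
    m = sum(subgroup_values) / n
    m0 = sum(population_values) / len(population_values)
    return math.sqrt(n) * (m - m0)

# Made-up target values: a subgroup of 4 clearly above-average examples
population = [10, 12, 9, 11, 30, 28, 32, 30]
subgroup = [30, 28, 32, 30]
quality(subgroup, population)  # sqrt(4) * (30 - 20.25) = 19.5
```

The √n factor trades off subgroup size against mean deviation, so a large subgroup with a modest shift can outscore a tiny subgroup with an extreme one.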

90 | Extracting tree-structured representations of trained networks - Craven, Shavlik - 1997

87 | A Statistical Theory for Quantitative Association Rules
- Aumann, Lindell
Citation Context: ...tions [SW00]. Other approaches to discover interesting patterns in numeric attributes include the search for “impact rules” proposed by Webb [Web01], which builds on earlier work by Aumann and Lindell [AL99]. Impact rules are quite similar to numeric subgroups: they consist of an antecedent (a conjunction of conditions) and a consequent, which describes the impact on the target variable. As done in stand...

69 | Statistical fraud detection: A review - Bolton, Hand - 2002

52 | Improvements to the SMO algorithm for SVM regression
- Shevade, Keerthi, et al.
Citation Context: ...error (RMSE) in a 10-fold cross-validation. We compared the results with those achieved by TBR and by a state-of-the-art SVM regression algorithm, namely the improved SMO Algorithm for SVM Regression [SKBM00]. We considered both a linear and a quadratic kernel. The results are shown in Table 7: 6 Closed subgroups are computed by calculating all subgroups and leaving only one subgroup description for ev...

32 | Discovering Association with Numeric Variables
- Webb
- 2001
Citation Context: ...uaranteeing precise bounds on confidence and quality of solutions [SW00]. Other approaches to discover interesting patterns in numeric attributes include the search for “impact rules” proposed by Webb [Web01], which builds on earlier work by Aumann and Lindell [AL99]. Impact rules are quite similar to numeric subgroups: they consist of an antecedent (a conjunction of conditions) and a consequent, which de...

18 | Spatial subgroup mining integrated in an object-relational spatial database
- Klösgen, May
Citation Context: ...teresting patterns in the data. Subgroup Discovery is a general approach that has been shown to be useful in a variety of application scenarios (like medical consultation systems [ABP06], spatial analysis [KM02] and marketing campaign planning [LCGF04]). One problem with subgroup discovery algorithms is that many of them return a very high number of subgroups. This is particularly true for algorithms that ex...

15 | Tight optimistic estimates for fast subgroup discovery
- Grosskreutz, Rüping, et al.
- 2008

Citation Context: ... enhance the performance. For another, the use of optimistic estimates [Wro97] has been shown to allow pruning large parts of the search space and thus significantly increase the overall performance [GRW08]. Finally, randomized approaches have been proposed that allow searching for subgroups by considering only a sample of the overall dataset, while guaranteeing precise bounds on confidence and quality ...

10 | SD-Map - a fast algorithm for exhaustive subgroup discovery
- Atzmüller, Puppe
- 2006

Citation Context: ...ch can become quite time-consuming. Thus, it depends on algorithms that quickly perform subgroup discovery. Fortunately, several approaches have recently been proposed to speed up this task: For one, [AP06] proposed the use of efficient data structures based on FP-Trees [HPY00] to enhance the performance. For another, the use of optimistic estimates [Wro97] has been shown to allow pruning large parts o...

10 | Subgroup discovery with CN2-SD
- Lavrac, Kavsek, Flach, Todorovski

Citation Context: ...iction can be calculated by about two summations. 5.2 Predictive Measures Regression Performance Despite the fact that optimizing prediction performance is not the primary goal of subgroup discovery ([LKFT04]), we measured the predictive performance of our algorithm. More precisely, we calculated the root mean squared error (RMSE) in a 10-fold cross-validation. We compared the results with those achieved ...

8 | A sequential sampling algorithm for a general class of utility criteria
- Scheffer, Wrobel

Citation Context: ...ndomized approaches have been proposed that allow to search for subgroups by considering only a sample of the overall dataset, while guaranteeing precise bounds on confidence and quality of solutions [SW00]. Other approaches to discover interesting patterns in numeric attributes include the search for “impact rules” proposed by Webb [Web01], which builds on earlier work by Aumann and Lindell [AL99]. Impa...

7 | Introspective Subgroup Analysis for Interactive Knowledge Refinement
- Baumeister, Atzmueller, et al.
- 2006
Citation Context: ...description of the most interesting patterns in the data. Subgroup Discovery is a general approach that has shown to be useful in a variety of application scenarios (like medical consultation systems [ABP06], spatial analysis [KM02] and marketing campaign planning [LCGF04]). One problem with subgroup discovery algorithms is that many of them return a very high number of subgroups. This is particularly tr...

6 | Knowledge-Based Sampling for Subgroup Discovery
- Scholz
- 2005
Citation Context: ... covered. The same weighting scheme has also been used in other subgroup discovery systems like APRIORI-SD [KLJ03]. Another approach to take into account information on subgroups, proposed by Scholz [Sch04], is to make use of sampling. Please note, however, that all these approaches consider the task of classification, while our algorithm (and TBR) considers the task of numeric regression. The idea to inc...

4 | ITER: an algorithm for predictive regression rule extraction - Huysmans, Baesens, et al. - 2006

4 | Handbook of data mining and knowledge discovery, chapter 16.3: Subgroup discovery
- Klösgen, Zytkow (editors)
- 2002

Citation Context: ...n this paper, we consider the problem to construct a representative set of subgroups, that is, we tackle what Klösgen called the “more ambitious [goal to construct] a best global system of subgroups” ([KZ02], Chapter 5.2). More precisely, we aim at building sets of subgroups that make up a good representation for numerical target attributes (like prices, costs or salaries). The case of numerical subgroup...

3 | Optimistic estimate pruning strategies for fast exhaustive subgroup discovery
- Springer, Rüping, et al.

Citation Context: ...latively simple) optimistic estimates [Wro97] to speed up the computation of the subgroups (see also the paragraph on fast subgroup discovery in the related work section). More details can be found in [GRSW08]. Using this implementation, for every dataset in Table 3 we could calculate a subgroup-based model in less than 30 seconds on an Intel Core 2 Duo E8400 with 3 GB of RAM under Windows XP. 5.1 Descri...

1 | Adapting classification rule induction to subgroup discovery
- Lavrac, Flach, Kavsek, Todorovski
- 2002

Citation Context: ... interpret. The idea to iteratively search for subgroups, masking the effects of the subgroups already discovered in the subsequent iterations, was also the basis for the algorithm CN2-SD presented in [LFKT02]. To consider different parts of the instance space in each iteration, CN2-SD uses a weighted covering algorithm which assigns a smaller weight to examples already covered. The same weighting scheme ...