## Propositionalization-based relational subgroup discovery with RSD (2006)

Venue: | Machine Learning |

Citations: | 24 - 5 self |

### BibTeX

@ARTICLE{Lavrač06propositionalization-basedrelational,
author = {Filip Železný and Nada Lavrač},
title = {Propositionalization-based relational subgroup discovery with RSD},
journal = {Machine Learning},
volume = {62},
year = {2006},
pages = {33--63}
}

### Abstract

Relational rule learning algorithms are typically designed to construct classification and prediction rules. However, relational rule learning can also be adapted to subgroup discovery. This paper proposes a propositionalization approach to relational subgroup discovery, achieved by appropriately adapting rule learning and first-order feature construction. The proposed approach was successfully applied to standard ILP problems (East-West trains, King-Rook-King chess endgame and mutagenicity prediction) and two real-life problems (analysis of telephone calls and traffic accident analysis).

### Citations

5438 |
C4.5: Programs for Machine Learning
- Quinlan
- 1993
Citation Context ...arning tasks (with best results indicated in bold). The table also provides 10-fold stratified cross-validation accuracy results of applying the J48 propositional learner (a reimplementation of C4.5 (Quinlan, 1993) available in WEKA (Witten & Frank, 1999)), supplied with propositionalized data based on feature sets of varying size obtained from the two propositionalization systems. To test the performance of t... |

3382 |
Data Mining: Practical Machine Learning Tools and Techniques with Java Implementations
- Witten, Frank
- 1999
Citation Context ...dicated in bold). The table also provides 10-fold stratified cross-validation accuracy results of applying the J48 propositional learner (a reimplementation of C4.5 (Quinlan, 1993) available in WEKA (Witten & Frank, 1999)), supplied with propositionalized data based on feature sets of varying size obtained from the two propositionalization systems. To test the performance of the two systems, producing the accuracy re... |

1096 | Inductive Logic Programming
- Muggleton
- 1992
Citation Context ...es Using relational background knowledge in the process of hypothesis construction is a distinctive feature of relational data mining (Džeroski & Lavrač, 2001) and inductive logic programming (ILP) (Muggleton, 1992; Lavrač & Džeroski, 1994). In propositional learning the idea of augmenting an existing set of attributes with new ones is known as constructive induction. The problem of feature construction has been ... |

1057 | Fast Effective Rule Induction - Cohen |

879 | Learning Logical Definitions from Relations
- Quinlan
- 1990
Citation Context ...aper is to discover subgroups that are sufficiently large and biased towards one of the two classes: East and West. KRK. In the chess endgame domain White King and Rook versus Black King, taken from (Quinlan, 1990) (first described in Muggleton et al. (1989)), the target relation illegal(A, B, C, D, E, F) states whether a position where the White King is at file and rank (A, B), the White Rook at (C, D) and th... |

807 | The CN2 induction algorithm
- Clark, Niblett
- 1989
Citation Context ...Propositionalization . Feature construction . Subgroup discovery 1. Introduction Classical rule learning algorithms are designed to construct classification and prediction rules (Michie et al., 1994; Clark & Niblett, 1989; Cohen, 1995). The goal of these predictive induction algorithms is to induce classification/prediction models consisting of a set of rules. On the other hand, opposed to model induction, descriptive... |

663 | Inverse entailment and Progol
- Muggleton
- 1995
Citation Context ...able consisting of truth values of first-order features, computed for each individual. 4.1. First-order feature construction RSD accepts feature language declarations similar to those used in Progol (Muggleton, 1995). A declaration lists the predicates that can appear in a feature, and to each argument of a predicate a type and a mode are assigned. In a correct feature, if two arguments have different types, the... |

506 |
Fast discovery of association rules
- Agrawal, Mannila, et al.
- 1996
Citation Context ...e Raedt & Dehaspe, 1997; Wrobel & Džeroski, 1995) aim to discover patterns described in the form of individual rules. Descriptive induction algorithms include association rule learners (e.g., APRIORI (Agrawal et al., 1996)), clausal discovery systems (e.g., CLAUDIEN (De Raedt & Dehaspe, 1997; De Raedt et al., 2001)), and subgroup discovery systems (e.g., MIDOS Editors: Hendrik Blockeel, David Jensen and Stefan Kramer F. ˇ... |

387 | Toward Optimal Feature Selection
- Koller, Sahami
- 1996
Citation Context ...new ones is known as constructive induction. The problem of feature construction has been studied extensively (Pagallo & Haussler, 1990; Cohen & Singer, 1991; Oliveira & Sangiovanni-Vincentelli, 1992; Koller & Sahami, 1996; Geibel & Wysotzki, 1996). A first-order counterpart of constructive induction is predicate invention (see e.g., Stahl, 1996 for an overview of predicate invention in ILP). Springers38 Mach Learn (2006) ... |

341 | Rule induction with CN2: Some recent improvements
- Clark, Boswell
- 1991
Citation Context ... the previous section) is constructed. The covering algorithm then invokes a new rule learning iteration on the training set from which all the covered examples 4 Alternatively, the Laplace estimate (Clark & Boswell, 1991) and the m-estimate (Cestnik, 1990; Džeroski et al., 1993) could also be used. 5 When inducing an ordered list of rules (a decision list (Rivest, 1987)), the heuristic search procedure finds the best ... |

229 | Levelwise search and borders of theories in knowledge discovery
- Mannila, Toivonen
- 1997
Citation Context ...based data mining framework for subgroup discovery Inductive databases (Imielinsky & Mannila, 1996) provide a database framework for knowledge discovery in which the definition of a data mining task (Mannila & Toivonen, 1997) involves the specification of a language of patterns and a set of constraints that a pattern has to satisfy with respect to a given database. In constraint-based data mining (Bayardo, 2002) the cons... |

204 |
Boolean feature discovery in empirical learning
- Pagallo, Haussler
- 1990
Citation Context ...94). In propositional learning the idea of augmenting an existing set of attributes with new ones is known as constructive induction. The problem of feature construction has been studied extensively (Pagallo & Haussler, 1990; Cohen & Singer, 1991; Oliveira & Sangiovanni-Vincentelli, 1992; Koller & Sahami, 1996; Geibel & Wysotzki, 1996). A first-order counterpart of constructive induction is predicate invention (see e.g., Stah... |

189 | Clausal discovery - De Raedt, Dehaspe - 1997 |

176 |
Estimating probabilities: A crucial task in machine learning
- Cestnik
- 1990
Citation Context ...e covering algorithm then invokes a new rule learning iteration on the training set from which all the covered examples 4 Alternatively, the Laplace estimate (Clark & Boswell, 1991) and the m-estimate (Cestnik, 1990; Džeroski et al., 1993) could also be used. 5 When inducing an ordered list of rules (a decision list (Rivest, 1987)), the heuristic search procedure finds the best rule body for the current set of ... |

155 | Theories for mutagenicity: a study of first-order and feature based induction - Srinivasan, Muggleton, et al. - 1996 |

155 | An algorithm for multi-relational discovery of subgroups
- Wrobel
- 1997
Citation Context ...ail: zelezny@fel.cvut.cz N. Lavrač Institute Jožef Stefan, Ljubljana, Slovenia, and Nova Gorica Polytechnic, Nova Gorica, Slovenia e-mail: nada.lavrac@ijs.si Springers34 Mach Learn (2006) 62: 33–63 (Wrobel, 1997; Wrobel, 2001), EXPLORA (Kloesgen, 1996) and SubgroupMiner (Kloesgen & May, 2002)). This paper investigates relational subgroup discovery. As in the MIDOS relational subgroup discovery system, a subg... |

116 | Relational data mining - Džeroski, Lavrač (editors) - 2001 |

103 |
Propositionalization approaches to relational data mining
- Kramer, Lavrač, et al.
- 2000
Citation Context ...tructive induction is predicate invention (see e.g., Stahl, 1996 for an overview of predicate invention in ILP). Propositionalization (Lavrač & Džeroski, 1994; Kramer et al., 2001) is a special case of predicate invention enabling the representation change from a relational representation to a propositional one. It involves the construction of features from relational backgrou... |

66 | Feature construction with Inductive Logic Programming: a study of quantitative predictions of biological activity aided by structural attributes. Data Mining and Knowledge Discovery
- Srinivasan, King
- 1999
Citation Context ... 1996), stochastic predicate invention (Kramer et al., 1998) and predicate invention achieved by using a variety of predictive learning techniques to learn background knowledge predicate definitions (Srinivasan & King, 1996). Earlier approaches, that are closely related to our propositionalization approach, are those used in LINUS (Lavrač & Džeroski, 1994), and those reported by Zucker and Ganascia (1996, 1998) and Seba... |

62 | Subgroup discovery with CN2-SD - Lavrač, Kavšek, et al. |

52 | Robust classification systems for imprecise environments
- Provost, Fawcett
- 2001
Citation Context ... following iterations of the weighted covering algorithm. 6 3.6. Subgroup evaluation and WRAcc interpretation in the ROC space Each subgroup describing rule corresponds to a point in the ROC space 7 (Provost & Fawcett, 1998), which is used to show classifier performance in terms of false positive rate FPr (the X-axis) and true positive rate TPr (the Y-axis). In the ROC space, rules/subgroups whose TPr/FPr tradeoff is cl... |

43 | Comparative evaluation of approaches to propositionalization - Krogel, Rawles, et al. - 2003 |

42 | Induction in Noisy Domains
- Clark, Niblett
- 1987
Citation Context ...rocedure (the covering algorithm) that repeatedly executes the search in order to induce a set of rules (described in Sections 3.4 and 3.5). Let us consider a standard propositional rule learner CN2 (Clark & Niblett, 1987; Clark & Niblett, 1989). Its search procedure used in learning a single rule performs beam search using classification accuracy of a rule as a heuristic function. The accuracy 3 of an induced rule of... |

41 |
Inductive logic programming for knowledge discovery in databases
- Wrobel
- 2001
Citation Context ...(Wrobel, 1997; Wrobel, 2001), EXPLORA (Kloesgen, 1996) and SubgroupMiner (Kloesgen & May, 2002)). This paper investigates relational subgroup discovery. As in the MIDOS relational subgroup discovery system, a subgroup discovery... |

40 | Expert-guided subgroup discovery: Methodology and application
- Gamberger, Lavrač
Citation Context ... including functions and recursive predicate definitions. Exception rule learning (Suzuki, 2004) also deals with finding interesting population subgroups. Recent approaches to subgroup discovery, SD (Gamberger and Lavrač, 2002) and CN2-SD (Lavrač et al., 2004), aim at overcoming the problem of inappropriate bias of the standard covering algorithm. Like the RSD algorithm, they use a weighted covering algorithm and modify the s... |

38 | Using rule sets to maximize ROC performance
- Fawcett
- 2001
Citation Context ...astogi, 2000), rule induction with constraints in relational domains including propositionalization (Aronis & Provost, 1994; Aronis et al., 1996), and using rule sets to maximize the ROC performance (Fawcett, 2001). In RSD, we use a constraint-based framework to handle the curse of dimensionality present in both procedural phases of RSD: first-order feature construction and subgroup discovery. We apply languag... |

30 |
To the international computing community: A new East-West challenge (Technical Report
- Michie, Muggleton, et al.
- 1994
Citation Context ...tional data mining . Propositionalization . Feature construction . Subgroup discovery 1. Introduction Classical rule learning algorithms are designed to construct classification and prediction rules (Michie et al., 1994; Clark & Niblett, 1989; Cohen, 1995). The goal of these predictive induction algorithms is to induce classification/prediction models consisting of a set of rules. On the other hand, opposed to model... |

25 | Exploiting background knowledge in automated discovery
- Aronis, Provost, et al.
- 1996
Citation Context ...s, such as size and accuracy constraints in decision trees (Garofalakis & Rastogi, 2000), rule induction with constraints in relational domains including propositionalization (Aronis & Provost, 1994; Aronis et al., 1996), and using rule sets to maximize the ROC performance (Fawcett, 2001). In RSD, we use a constraint-based framework to handle the curse of dimensionality present in both procedural phases of RSD: firs... |

25 | Learning Relational Concepts with Decision Trees - Geibel, Wysotzki - 1996 |

16 | Three companions for data mining in first order logic - De Raedt, Blockeel, et al. - 2001 |

15 |
Explora: A Multipattern and Multistrategy Discovery Assistant
- Kloesgen
- 1996
Citation Context ...(Wrobel, 1997; Wrobel, 2001), EXPLORA (Kloesgen, 1996) and SubgroupMiner (Kloesgen & May, 2002)). This paper investigates relational subgroup discovery. As in the MIDOS relational subgroup discovery system, a subgroup discovery task is defined as follow... |

15 | Constructive Induction Using a Non-Greedy Strategy for Feature Selection - Oliveira, Sangiovanni-Vincentelli - 1992 |

15 |
Predicate invention in inductive logic programming
- Stahl
- 1996
Citation Context ...1990; Cohen & Singer, 1991; Oliveira & Sangiovanni-Vincentelli, 1992; Koller & Sahami, 1996; Geibel & Wysotzki, 1996). A first-order counterpart of constructive induction is predicate invention (see e.g., Stahl, 1996 for an overview of predicate invention in ILP). Propositionalization (Lavrač & Džeroski, 1994; Kramer et al., 2001) is a special case of predicate invention en... |

14 | Using the m-estimate in rule induction - Džeroski, Cestnik, et al. - 1993 |

8 | Efficiently Constructing Relational Features from Background Knowledge for Inductive Machine Learning
- Aronis, Provost
- 1994
Citation Context ... types of patterns/models, such as size and accuracy constraints in decision trees (Garofalakis & Rastogi, 2000), rule induction with constraints in relational domains including propositionalization (Aronis & Provost, 1994; Aronis et al., 1996), and using rule sets to maximize the ROC performance (Fawcett, 2001). In RSD, we use a constraint-based framework to handle the curse of dimensionality present in both procedura... |

8 | Discovering interesting exception rules with rule pair
- Suzuki
- 2004
Citation Context ...ata has the form of ground Prolog facts and background knowledge is either in the form of facts or intensional rules, including functions and recursive predicate definitions. Exception rule learning (Suzuki, 2004) also deals with finding interesting population subgroups. Recent approaches to subgroup discovery, SD (Gamberger and Lavrač, 2002) and CN2-SD (Lavrač et al., 2004), aim at overcoming the problem of ina... |

8 | Low size-complexity inductive logic programming: The East-West Challenge considered as a problem in cost-sensitive classification
- Turney
- 1995
Citation Context ...grades the propositionalization through first-order feature construction proposed by Flach and Lachiche (1999) and Lavrač and Flach (2001). Related approaches include feature construction in RL-ICET (Turney, 1996), stochastic predicate invention (Kramer et al., 1998) and predicate invention achieved by using a variety of predictive learning techniques to learn background knowledge predicate definitions (Srini... |

8 | Representation Changes for Efficient Learning in Structural Domains - Zucker - 1996 |

6 |
The Many Roles of Constraints in Data Mining
- Bayardo
- 2002
Citation Context ...annila & Toivonen, 1997) involves the specification of a language of patterns and a set of constraints that a pattern has to satisfy with respect to a given database. In constraint-based data mining (Bayardo, 2002) the constraints that a pattern has to satisfy consist of language constraints and evaluation constraints. The first concern the pattern itself, while the second concern the validity of the pattern w... |

6 | A learning system for decision support in telecommunications - Železný, O - 2002 |

4 | Scalable data mining with model constraints
- Garofalakis, Rastogi
Citation Context ...ng frequent episodes, Datalog queries, molecular fragments, etc. Few approaches exist that use constraints for other types of patterns/models, such as size and accuracy constraints in decision trees (Garofalakis & Rastogi, 2000), rule induction with constraints in relational domains including propositionalization (Aronis & Provost, 1994; Aronis et al., 1996), and using rule sets to maximize the ROC performance (Fawcett, 200... |

3 |
Learning decision lists. Machine Learning 2(3):229–246
- Rivest
- 1987
Citation Context ...s 4 Alternatively, the Laplace estimate (Clark & Boswell, 1991) and the m-estimate (Cestnik, 1990; Džeroski et al., 1993) could also be used. 5 When inducing an ordered list of rules (a decision list (Rivest, 1987)), the heuristic search procedure finds the best rule body for the current set of training examples, assigning the rule head to the most frequent class of the set of examples covered by the rule. Bef... |

2 |
A database perspective on knowledge discovery. Invited Talk at KDD'95
- Imielinsky
- 1995
Citation Context ...f all the other subgroup descriptions. This evaluation method has been used in the experiments in this paper. 9 3.7. Constraint-based data mining framework for subgroup discovery Inductive databases (Imielinsky & Mannila, 1996) provide a database framework for knowledge discovery in which the definition of a data mining task (Mannila & Toivonen, 1997) involves the specification of a language of patterns and a set of constr... |

2 |
Census data mining—An application
- Klösgen, May
- 2002
Citation Context ...(Wrobel, 1997; Wrobel, 2001), EXPLORA (Kloesgen, 1996) and SubgroupMiner (Kloesgen & May, 2002)). This paper investigates relational subgroup discovery. As in the MIDOS relational subgroup discovery system, a subgroup discovery task is defined as follows: Given a population of individuals and ... |

1 |
Hypothesis-driven constructive induction in AQ17: A method and experiments
- Cohen, Singer
- 1991
Citation Context ...ning the idea of augmenting an existing set of attributes with new ones is known as constructive induction. The problem of feature construction has been studied extensively (Pagallo & Haussler, 1990; Cohen & Singer, 1991; Oliveira & Sangiovanni-Vincentelli, 1992; Koller & Sahami, 1996; Geibel & Wysotzki, 1996). A first-order counterpart of constructive induction is predicate invention (see e.g., Stahl, 1996 for an overvie... |

1 |
On the road to knowledge: Mining 21 years of UK Traffic Accident Reports
- Flach, Mladenić, et al.
- 2003
Citation Context ...of the input arguments. 6.3. The UK Traffic Accidents Domain The UK Traffic data set includes the records of all the accidents that happened on the roads of Great Britain between years 1979 and 1999 (Flach et al., 2003). It is a relational data set consisting of 3 related data tables: the ACCIDENT data, the VEHICLE data and the CASUALTY data. The ACCIDENT data consists of the records of all accidents that happened ... |

1 | Analysis of example weighting in subgroup discovery by comparison of three algorithms on a real-life data set - Kavšek, Lavrač - 2004 |

1 |
Stochastic Propositionalization of Non-determinate Background Knowledge
- Kramer, Pfahringer, et al.
- 1998
Citation Context ...rder feature construction proposed by Flach and Lachiche (1999) and Lavrač and Flach (2001). Related approaches include feature construction in RL-ICET (Turney, 1996), stochastic predicate invention (Kramer et al., 1998) and predicate invention achieved by using a variety of predictive learning techniques to learn background knowledge predicate definitions (Srinivasan & King, 1996). Earlier approaches, that are clos... |

1 |
An extended transformation approach to inductive logic programming
- Lavrač, Flach
- 2001
Citation Context ...ker and Ganascia (1996, 1998) and Sebag and Rouveirol (1997). The RSD approach to first-order feature construction can be applied in the so-called individual-centered domains (Flach & Lachiche, 1999; Lavrač & Flach, 2001; Kramer et al., 2001), where there is a clear notion of individual, and learning occurs at the level of individuals only. For example, individual-centered domains include classification problems in m... |

1 | A study of relevance for learning in deductive databases - Lavrač, Gamberger, et al. - 1999 |