## Mr-SBC: a Multi-Relational Naive Bayes Classifier (2003)


Venue: Todorovski & H. Blockeel (Eds.), Knowledge Discovery in Databases: PKDD 2003, Lecture Notes in Artificial Intelligence

Citations: 13 (5 self)

### BibTeX

```bibtex
@INPROCEEDINGS{Ceci03mr-sbc:a,
  author    = {Michelangelo Ceci and Donato Malerba},
  title     = {Mr-SBC: a Multi-Relational Naive Bayes Classifier},
  booktitle = {Todorovski & H. Blockeel (Eds.), Knowledge Discovery in Databases: PKDD 2003, Lecture Notes in Artificial Intelligence},
  year      = {2003},
  pages     = {95--106},
  publisher = {Springer}
}
```

### Abstract

In this paper we propose an extension of the naïve Bayes classification method to the multi-relational setting. In this setting, training data are stored in several tables related by foreign key constraints, and each example is represented by a set of related tuples rather than a single row as in the classical data mining setting. This work is characterized by three aspects. First, it adopts an integrated approach to computing the posterior probability of each class, making use of first-order classification rules. Second, it applies to both discrete and continuous attributes by means of a supervised discretization. Third, it takes into account knowledge of the data model embedded in the database schema during the generation of classification rules. The proposed method has been implemented in the new system Mr-SBC, which is tightly integrated with a relational DBMS. Testing has been performed on two datasets and four benchmark tasks. Results on predictive accuracy and efficiency are in favour of Mr-SBC for the most complex tasks.
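The classification rule underlying any naïve Bayes method, f(x) = arg max_i P(C_i|x) under the attribute-independence assumption, can be sketched as follows. This is a minimal, hypothetical single-table illustration, not Mr-SBC itself (which combines probabilities over first-order rules across related tables); the function names and the add-one smoothing are my own choices:

```python
import math
from collections import defaultdict

def train_naive_bayes(samples):
    """Estimate class counts and per-(feature, value) class counts
    from (feature_dict, label) pairs."""
    class_counts = defaultdict(int)
    feat_counts = defaultdict(lambda: defaultdict(int))  # (feature, value) -> class -> count
    for features, label in samples:
        class_counts[label] += 1
        for f, v in features.items():
            feat_counts[(f, v)][label] += 1
    return class_counts, feat_counts

def classify(features, class_counts, feat_counts):
    """Return arg max_i P(Ci) * prod_j P(xj | Ci), computed in log space
    with add-one (Laplace) smoothing to avoid zero probabilities."""
    total = sum(class_counts.values())
    best, best_score = None, -math.inf
    for c, n_c in class_counts.items():
        score = math.log(n_c / total)  # log prior P(Ci)
        for f, v in features.items():
            score += math.log((feat_counts[(f, v)][c] + 1) / (n_c + 2))
        if score > best_score:
            best, best_score = c, score
    return best
```

Applying Bayes' theorem, P(Ci|x) ∝ P(Ci)·P(x|Ci); the denominator P(x) is constant across classes, so it can be dropped from the arg max, as the code does.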

### Citations

653 | Multi-interval discretization of continuous-valued attributes for classification learning
- Fayyad, Irani
- 1993
Citation Context: ...method 1RD by Holte [14] for the induction of one-level decision trees, that proved to work well with the Naïve Bayes Classifier [4]. It is also different from the one-step method by Fayyad and Irani [6] that recursively splits the initial interval according to the class information entropy measure until a stopping criterion based on the Minimum Description Length (MDL) principle is verified. The c...
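The recursive splitting that this context contrasts with Mr-SBC's discretization can be sketched as follows. This is an illustrative simplification, with hypothetical names, in which a plain information-gain threshold (`min_gain`) stands in for Fayyad and Irani's MDL-based stopping criterion:

```python
import math

def entropy(labels):
    """Class-information entropy of a non-empty label list."""
    n = len(labels)
    return -sum((labels.count(c) / n) * math.log2(labels.count(c) / n)
                for c in set(labels))

def recursive_split(values, labels, min_gain=0.1):
    """Recursively pick the cut point that maximizes information gain;
    stop when the gain drops below min_gain (simplified stopping rule).
    Returns the sorted list of cut points."""
    pairs = sorted(zip(values, labels))
    n = len(pairs)
    base = entropy([l for _, l in pairs])
    best_gain, best_i = 0.0, None
    for i in range(1, n):
        if pairs[i][0] == pairs[i - 1][0]:
            continue  # only cut between distinct attribute values
        left = [l for _, l in pairs[:i]]
        right = [l for _, l in pairs[i:]]
        gain = base - (len(left) / n) * entropy(left) - (len(right) / n) * entropy(right)
        if gain > best_gain:
            best_gain, best_i = gain, i
    if best_i is None or best_gain < min_gain:
        return []
    cut = (pairs[best_i - 1][0] + pairs[best_i][0]) / 2
    lv, ll = zip(*pairs[:best_i])
    rv, rl = zip(*pairs[best_i:])
    return (recursive_split(list(lv), list(ll), min_gain)
            + [cut]
            + recursive_split(list(rv), list(rl), min_gain))
```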

601 | On the Optimality of the Simple Bayesian Classifier under Zero-One Loss
- Domingos, Pazzani
- 1997
Citation Context: ... a training sample S={(x, y) ∈ X × Y | y=g(x)} as input and returns a function f which is hopefully close to g on the domain X. A well-known solution is represented by the Naïve Bayesian Classifiers [3], which aim to classify any x∈X into the class maximizing the posterior probability P(Ci|x) that the observation x is of class Ci, that is: f(x) = arg maxi P(Ci|x). By applying the Bayes theorem, P(Ci|x) ...

510 | Learning probabilistic relational models
- Getoor, Friedman, et al.
- 2001
Citation Context: ... data were originally structured is lost. Consequently, the (multi-)relational data mining approach has been receiving considerable attention in the literature, especially for the classification task [1,10,15,20,7]. In the traditional classification setting [18], data are generated independently and with an identical distribution from an unknown distribution P on some domain X and are labelled according to an u...

438 | Very simple classification rules perform well on most commonly used datasets
- Holte
- 1993
Citation Context: ...a second step. Merging of two contiguous bins is performed when the increase of entropy is lower than a user-defined threshold (MAX_GAIN). This method is a variant of the one-step method 1RD by Holte [14] for the induction of one-level decision trees, that proved to work well with the Naïve Bayes Classifier [4]. It is also different from the one-step method by Fayyad and Irani [6] that recursively spl...
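The bin-merging step described in this context (merge two contiguous bins when the resulting increase of entropy stays below MAX_GAIN) can be sketched as follows. This is a hypothetical reconstruction under my own assumptions about data representation: each bin is simply the list of class labels of the examples falling in it, and `max_gain` plays the role of the paper's MAX_GAIN threshold:

```python
import math

def class_entropy(labels):
    """Class-information entropy of a label list (0.0 for an empty list)."""
    n = len(labels)
    return -sum((labels.count(c) / n) * math.log2(labels.count(c) / n)
                for c in set(labels)) if n else 0.0

def merge_bins(bins, max_gain=0.05):
    """Greedily merge contiguous bins whenever merging raises the
    weighted class entropy by less than max_gain."""
    bins = [b for b in bins if b]  # drop empty bins
    merged = True
    while merged and len(bins) > 1:
        merged = False
        for i in range(len(bins) - 1):
            a, b = bins[i], bins[i + 1]
            # entropy before: weighted average of the two bins
            before = (len(a) * class_entropy(a) + len(b) * class_entropy(b)) / (len(a) + len(b))
            after = class_entropy(a + b)
            if after - before < max_gain:
                bins[i:i + 2] = [a + b]  # merge the contiguous pair
                merged = True
                break
    return bins
```

Two pure bins of the same class merge (entropy stays 0), while merging bins dominated by different classes raises entropy past the threshold and is rejected.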

408 | Supervised and unsupervised discretization of continuous features
- Dougherty, Kohavi, et al.
- 1995
Citation Context: ...defined threshold (MAX_GAIN). This method is a variant of the one-step method 1RD by Holte [14] for the induction of one-level decision trees, that proved to work well with the Naïve Bayes Classifier [4]. It is also different from the one-step method by Fayyad and Irani [6] that recursively splits the initial interval according to the class information entropy measure until a stopping criterion based...

126 | Top-down Induction of First Order Logical Decision Trees
- Blockeel
- 1998
Citation Context: ... data were originally structured is lost. Consequently, the (multi-)relational data mining approach has been receiving considerable attention in the literature, especially for the classification task [1,10,15,20,7]. In the traditional classification setting [18], data are generated independently and with an identical distribution from an unknown distribution P on some domain X and are labelled according to an u...

52 | An experimental comparison of human and machine learning formalisms
- Muggleton, Bain, et al.
- 1989
Citation Context: ...ed on the Mutagenesis datasets and on Biodegradability datasets. 6.1 Results on Mutagenesis These datasets, taken from the MLNET repository, concern the problem of identifying the mutagenic compounds [19] and have been extensively used to test both inductive logic programming (ILP) systems and (multi-)relational mining systems. We considered, analogously to related experiments in the literature, the “...

46 | Transformation-based learning using multirelational aggregation
- Krogel, Wrobel
- 2001
Citation Context: ... data were originally structured is lost. Consequently, the (multi-)relational data mining approach has been receiving considerable attention in the literature, especially for the classification task [1,10,15,20,7]. In the traditional classification setting [18], data are generated independently and with an identical distribution from an unknown distribution P on some domain X and are labelled according to an u...

45 | Attribute-Value Learning versus Inductive Logic Programming: The Missing Links
- Raedt
Citation Context: .... Although in principle it is possible to consider a single relation reconstructed by performing a relational join operation on the tables, this approach is fraught with many difficulties in practice [2,11]. It produces an extremely large, and impractical to handle, table with lots of data being repeated. A different approach is the construction of a single central relation that summarizes and/or aggreg...

40 | Learning statistical models from relational data
- Getoor
- 2002
Citation Context: ...f data model that can help to guide the search process. This is an alternative to asking the users to specify a language bias, such as in 1BC or 1BC2. A different approach has been proposed by Getoor [13], where Statistical Relational Models (SRMs) are learnt taking advantage of the tight integration with a database. SRMs are very similar to Bayesian Networks. The main difference is that th...

39 | Inductive logic programming for knowledge discovery in databases
- Wrobel
- 2001
Citation Context: ...n be represented as a single table, where each row corresponds to an example and each column to a predictor variable or to the target variable Y. This assumption, also known as single-table assumption [23], seems quite restrictive in some data mining applications, where data are stored in a database and are organized into several tables for reasons of efficient storage and access. In this context, both...

29 | Confirmation-guided discovery of first order rules with Tertius
- Flach, Lachiche
- 2000
Citation Context: ...eep the phases of first-order rules/conditions generation and of probability estimation separate. In particular, Pompe and Kononenko use ILP-R to induce first-order rules [21], while 1BC uses TERTIUS [8] to generate first order features. Then, the probabilities are computed for each first-order rule or feature. In the classification phase, the two approaches are similar to a multiple classifier becau...

26 | MRDTL: A multi-relational decision tree learning algorithm
- Leiva
- 2002
Citation Context: ...s for Progol2, Foil, Tilde are taken from [1]. Results for Progol_1 are taken from [22]. The results for 1BC are taken from [9]. Results for 1BC2 are taken from [16]. Results for MRDTL are taken from [17]. The values are the results of 10-fold cross-validation. Accuracy (%):

| System | BK0 | BK1 | BK2 |
|---|---|---|---|
| Progol_1 | 79 | 86 | 86 |
| Progol_2 | 76 | 81 | 86 |
| Foil | 61 | 61 | 83 |
| Tilde | 75 | 79 | 85 |
| MRDTL | 67 | 87 | 88 |
| 1BC2 | 72.9 | --- | 72.9 |
| 1BC | 80.3 | --- | ... |

23 | Naive Bayesian classifier within ILP-R
- Pompe, Kononenko
- 1995

20 | 1BC2: a true first-order Bayesian classifier
- Lachiche, Flach
- 2003
Citation Context: ... of its parts (e.g. a bond between two atoms). An elementary first-order feature consists of zero or more structural predicates and one property. An evolution of 1BC is represented by the system 1BC2 [16], where no preliminary generation of first-order conditions is present. Predicates whose probabilities have to be estimated are dynamically defined on the basis of the individual to classify. Therefor...

20 | The role of background knowledge: using a problem from chemistry to examine the performance of an ILP program
- Srinivasan, King, et al.
- 1999
Citation Context: ... systems and (multi-)relational mining systems. We considered, analogously to related experiments in the literature, the “regression friendly” dataset of 188 elements. A recent study on this database [22] recognizes five levels of background knowledge for mutagenesis which can provide richer descriptions of the examples. In this study we used only the first three levels of background knowledge in orde...

16 | Multi-relational data mining using probabilistic relational models: research summary
- Getoor
- 2001
Citation Context: .... Although in principle it is possible to consider a single relation reconstructed by performing a relational join operation on the tables, this approach is fraught with many difficulties in practice [2,11]. It produces an extremely large, and impractical to handle, table with lots of data being repeated. A different approach is the construction of a single central relation that summarizes and/or aggreg...

13 | Linear space induction in first order logic with ReliefF
- Pompe, Kononenko
- 1995
Citation Context: ...order naïve Bayes classifiers have already been reported in the literature. In particular, Pompe and Kononenko [20] proposed a method based on a two-step process. The first step uses the ILP-R system [21] to learn a hypothesis in the form of a set of first-order rules and then, in the second step, the rules are probabilistically analyzed. During the classification phase, the conditional probability di...

9 | Learning Statistical Models of Relational Data
- Getoor
- 2001

4 | First-order Bayesian classification with 1BC
- Flach, Lachiche
- 2000
Citation Context: ...arison on the set of 188 regression friendly elements of Mutagenesis. Results for Progol2, Foil, Tilde are taken from [1]. Results for Progol_1 are taken from [22]. The results for 1BC are taken from [9]. Results for 1BC2 are taken from [16]. Results for MRDTL are taken from [17]. The values are the results of 10-fold cross-validation. System Accuracy(%) BK0 BK1 BK2 Progol_1 79 86 86 Progol_2 76 81 8...

3 | Decomposing probability distributions on structured individuals
- Flach, Lachiche
- 2000