Multi-Relational Decision Tree Induction
- In Proceedings of PKDD ’99, Prague, Czech Republic, September
, 1999
Cited by 18 (0 self)
Discovering decision trees is an important set of techniques in KDD, both because of their simple interpretation and the efficiency of their discovery. One of their disadvantages is that they do not take the structure of the mining object into account. By going from the standard single-relation approach to the multi-relational approach as in ILP this disadvantage is removed. However, the straightforward generalization loses the efficiency of the standard algorithms. In this paper we present a framework that allows the efficient discovery of multi-relational decision trees through the exploitation of the domain knowledge encoded in the data model of the database.
Introduction: The induction of decision trees has been getting a lot of attention in the field of Knowledge Discovery in Databases over the past few years. This popularity has been largely due to the efficiency with which decision trees can be induced from large datasets, as well as to the elegant and intuitive representation ...
Hierarchical Multi-Classification
, 2002
Cited by 17 (5 self)
The problem of hierarchical multi-classification is considered.
Relational Reinforcement Learning
- Multi-Agent Systems and Applications, 9th ECCAI Advanced Course ACAI 2001 and Agent Link’s 3rd European Agent Systems Summer School (EASSS 2001), volume 2086 of Lecture Notes in Computer Science
, 2001
Cited by 17 (1 self)
This paper presents an introduction to reinforcement learning and relational reinforcement learning at a level to be understood by students and researchers with different backgrounds.
Integrating Declarative Knowledge in Hierarchical Clustering Tasks
- Proceedings of the International Symposium on Intelligent Data Analysis
, 1999
Cited by 16 (0 self)
The capability of making use of existing prior knowledge is an important challenge for Knowledge Discovery tasks. As an unsupervised learning task, clustering appears to be one of the tasks that might benefit most from prior knowledge. In this paper, we propose a method for providing declarative prior knowledge to a hierarchical clustering system, stressing the interactive component. Preliminary results suggest that declarative knowledge is a powerful bias for improving the quality of clustering in domains where the internal biases of the system are inappropriate or there is not enough evidence in the data, and that it can lead the system to build more comprehensible clusterings.
Efficient Algorithms for Decision Tree Cross-validation
- Journal of Machine Learning Research
, 2002
Cited by 11 (3 self)
Cross-validation is a useful and generally applicable technique often employed in machine learning, including decision tree induction. An important disadvantage of straightforward implementation of the technique is its computational overhead. In this paper we show that, for decision trees, the computational overhead of cross-validation can be reduced significantly by integrating the cross-validation with the normal decision tree induction process. We discuss how existing decision tree algorithms can be adapted to this aim, and provide an analysis of the speedups these adaptations may yield. We identify a number of parameters that influence the obtainable speedups, and validate and refine our analysis with experiments on a variety of data sets with two different implementations.
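The abstract does not spell out how the integration works. As a rough illustration only, one way to share work across folds is to collect class counts once per fold and derive each fold's training-set counts by subtraction instead of rescanning the data. The function names `fold_class_counts` and `training_counts` and the toy data below are hypothetical and are not taken from the paper.

```python
from collections import Counter

def fold_class_counts(labels, folds, n_folds):
    """Count class labels once per fold, in a single pass over the data."""
    per_fold = [Counter() for _ in range(n_folds)]
    for label, fold in zip(labels, folds):
        per_fold[fold][label] += 1
    return per_fold

def training_counts(per_fold):
    """Class counts of each fold's training set: overall total minus that fold."""
    total = Counter()
    for counts in per_fold:
        total.update(counts)
    result = []
    for counts in per_fold:
        train = Counter(total)
        train.subtract(counts)
        result.append(+train)  # unary plus drops zero counts
    return result

# Toy data (assumed): six examples assigned to three folds.
labels = ["a", "a", "b", "b", "a", "b"]
folds = [0, 1, 2, 0, 1, 2]
train = training_counts(fold_class_counts(labels, folds, 3))
for i, counts in enumerate(train):
    print(i, dict(counts))
```

The same subtraction trick would apply to any additive statistic used to score a candidate split, which is why a single pass can serve all folds in this sketch.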
Hierarchical Multi-classification with Predictive Clustering Trees in Functional Genomics
- the Workshop on Computational Methods in Bioinformatics at the 12th Portuguese Conference on Artificial Intelligence
, 2005
Cited by 10 (2 self)
This paper investigates how predictive clustering trees can be used to predict gene function in the genome of the yeast Saccharomyces cerevisiae. We consider the MIPS FunCat classification scheme, in which each gene is annotated with one or more classes selected from a given functional class hierarchy. This setting presents two important challenges to machine learning: (1) each instance is labeled with a set of classes instead of just one class, and (2) the classes are structured in a hierarchy; ideally the learning algorithm should also take this hierarchical information into account. Predictive clustering trees generalize decision trees and can be applied to a wide range of prediction tasks by plugging in a suitable distance metric. We define an appropriate distance metric for hierarchical multi-classification and present experiments evaluating this approach on a number of data sets that are available for yeast.
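The abstract does not state the metric itself. The sketch below shows one plausible form, assumed here for illustration: a weighted Euclidean distance between class sets, where each class set is closed under ancestors and a class at depth d contributes weight `w0 ** d`, so disagreements deep in the hierarchy count less than disagreements near the root. The `/`-separated class codes, the value `w0=0.75`, and the function names are all hypothetical.

```python
import math

def with_ancestors(classes):
    """Expand codes such as '1/2/3' so the set also contains '1' and '1/2'."""
    closed = set()
    for code in classes:
        parts = code.split("/")
        for i in range(1, len(parts) + 1):
            closed.add("/".join(parts[:i]))
    return closed

def hier_distance(s1, s2, w0=0.75):
    """Weighted Euclidean distance between two class sets (assumed form)."""
    a, b = with_ancestors(s1), with_ancestors(s2)
    differing = a ^ b  # classes present in exactly one of the two sets
    # Depth of a class = number of path components; weight = w0 ** depth.
    return math.sqrt(sum(w0 ** (code.count("/") + 1) for code in differing))

# Siblings under a shared parent are closer than two different top-level classes.
print(hier_distance({"1/2"}, {"1/3"}))
print(hier_distance({"1"}, {"2"}))
```

With a distance of this kind, a tree learner can score splits by how much they reduce the within-node spread of class sets, which handles both the multi-label and the hierarchical aspects in one measure.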
Constraint Based Induction of Multi-Objective Regression Trees
- In proceedings of the 4th International Workshop on Knowledge Discovery in Inductive Databases
, 2005
Cited by 8 (5 self)
Constraint-based inductive systems are a key component of inductive databases and are responsible for building the models that satisfy the constraints in the inductive queries. In this paper, we propose a constraint-based system for building multi-objective regression trees.
Kernelizing the output of tree-based methods
- In Proceedings of the 23rd International Conference on Machine Learning Edited by: Cohen W, Moore A. ACM
, 2006
Cited by 7 (4 self)
We extend tree-based methods to the prediction of structured outputs using a kernelization of the algorithm that allows one to grow trees as soon as a kernel can be defined on the output space. The resulting algorithm, called output kernel trees (OK3), generalizes classification and regression trees as well as tree-based ensemble methods in a principled way. It inherits several features of these methods such as interpretability, robustness to irrelevant variables, and input scalability. When only the Gram matrix over the outputs of the learning sample is given, it learns the output kernel as a function of inputs. We show that the proposed algorithm works well on an image reconstruction task and on a biological network inference problem.
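To make the role of the Gram matrix concrete, the sketch below computes output variance in the kernel-induced feature space using only kernel values, via the identity var(S) = (1/N) Σ k(y_i, y_i) − (1/N²) Σ k(y_i, y_j), and uses it in a variance-reduction split score. This is a minimal illustration under assumed names (`kernel_variance`, `split_score`) with a toy linear kernel, not the authors' implementation.

```python
def kernel_variance(gram, idx):
    """Variance of the outputs indexed by idx, computed from the Gram matrix only."""
    n = len(idx)
    if n == 0:
        return 0.0
    diag = sum(gram[i][i] for i in idx) / n
    cross = sum(gram[i][j] for i in idx for j in idx) / (n * n)
    return diag - cross

def split_score(gram, left, right):
    """Variance reduction obtained by splitting left + right into two children."""
    n = len(left) + len(right)
    parent = kernel_variance(gram, left + right)
    return (parent
            - len(left) / n * kernel_variance(gram, left)
            - len(right) / n * kernel_variance(gram, right))

# Toy outputs (assumed) with a linear kernel k(y, y') = y * y'.
y = [0.0, 0.0, 4.0, 4.0]
gram = [[a * b for b in y] for a in y]

print(split_score(gram, [0, 1], [2, 3]))  # separates the two output groups
print(split_score(gram, [0, 2], [1, 3]))  # mixes them
```

With a linear kernel on scalar outputs this reduces to the usual regression-tree variance criterion; swapping in another Gram matrix changes the notion of output similarity without touching the scoring code.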
Learning Predictive Clustering Rules
- In 4th Int’l Workshop on Knowledge Discovery in Inductive Databases: Revised Selected and Invited Papers, volume 3933 of LNCS
, 2005
Cited by 7 (3 self)
The two most commonly addressed data mining tasks are predictive modelling and clustering. Here we address the task of predictive clustering, which contains elements of both and generalizes them to some extent. We propose a novel approach to predictive clustering called predictive clustering rules, present an initial implementation and its preliminary experimental evaluation.
Automatic Construction and Refinement of a Class Hierarchy over Semistructured Data
, 2001
Cited by 6 (1 self)
In this paper, we present an approach based on the use of two languages of description of classes for the automatic clustering of semistructured data. The first language of classes has a high power of abstraction and guides the construction of a lattice of classes covering the whole set of the data. The second language of classes, more expressive and more precise, is the basis for the refinement of a part of the lattice that the user wants to focus on. Our approach has been implemented and tested on real data in the setting of the GAEL project, which aims at building flexible electronic catalogs organized as a hierarchy of classes of products. Our experiments have been conducted on real data coming from the C/Net (http://www.cnet.com) electronic catalog of computer products.

