Results 1–10 of 31
Theories for Mutagenicity: A Study in First-Order and Feature-Based Induction
 Artificial Intelligence
, 1996
"... A classic problem from chemistry is used to test a conjecture that in domains for which data are most naturally represented by graphs, theories constructed with Inductive Logic Programming (ILP) will significantly outperform those using simpler featurebased methods. One area that has long been asso ..."
Abstract

Cited by 150 (30 self)
A classic problem from chemistry is used to test a conjecture that in domains for which data are most naturally represented by graphs, theories constructed with Inductive Logic Programming (ILP) will significantly outperform those using simpler feature-based methods. One area that has long been associated with graph-based or structural representation and reasoning is organic chemistry. In this field, we consider the problem of predicting the mutagenic activity of small molecules: a property that is related to carcinogenicity, and an important consideration in developing less hazardous drugs. By providing an ILP system with progressively more structural information concerning the molecules, we compare the predictive power of the logical theories constructed against benchmarks set by regression, neural, and tree-based methods. 1 Introduction Constructing theories to explain observations occupies much of the creative hours of scientists and engineers. Programs from the field of Inductiv...
Top-down induction of clustering trees
 In 15th Int'l Conf. on Machine Learning
, 1998
"... An approach to clustering is presented that adapts the basic topdown induction of decision trees method towards clustering. To this aim, it employs the principles of instance based learning. The resulting methodology is implemented in the TIC (Top down Induction of Clustering trees) system for firs ..."
Abstract

Cited by 99 (22 self)
An approach to clustering is presented that adapts the basic top-down induction of decision trees method towards clustering. To this aim, it employs the principles of instance-based learning. The resulting methodology is implemented in the TIC (Top-down Induction of Clustering trees) system for first-order clustering. The TIC system employs the first-order logical decision tree representation of the inductive logic programming system Tilde. Various experiments with TIC are presented, in both propositional and relational domains.
Structural Regression Trees
, 1996
"... In many realworld domains the task of machine learning algorithms is to learn a theory predicting numerical values. In particular several standard test domains used in Inductive Logic Programming (ILP) are concerned with predicting numerical values from examples and relational and mostly nondeterm ..."
Abstract

Cited by 64 (10 self)
In many real-world domains the task of machine learning algorithms is to learn a theory predicting numerical values. In particular, several standard test domains used in Inductive Logic Programming (ILP) are concerned with predicting numerical values from examples and relational, mostly non-determinate background knowledge. However, so far only one ILP algorithm, a covering algorithm called FORS, can predict numbers and cope with non-determinate background knowledge. In this paper we present Structural Regression Trees (SRT), a new algorithm which can be applied to the above class of problems by integrating the statistical method of regression trees into ILP. SRT constructs a tree containing a literal (an atomic formula or its negation) or a conjunction of literals in each node, and assigns a numerical value to each leaf. SRT provides more comprehensible results than purely statistical methods, and can be applied to a class of problems most other ILP syste...
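The tree structure the abstract describes can be sketched in a few lines: internal nodes hold a test (standing in for a literal or conjunction of literals) and each leaf holds a numeric prediction. The tests and the toy molecule data below are invented for illustration; the actual system evaluates Prolog literals against relational background knowledge.

```python
# Sketch of an SRT-style tree: tests at internal nodes, numbers at leaves.
# The example predicates ("nitro", "weight") are hypothetical stand-ins.

class Leaf:
    def __init__(self, value):
        self.value = value  # numeric prediction assigned to this leaf

class Node:
    def __init__(self, test, yes, no):
        self.test = test    # predicate over an example (plays the literal role)
        self.yes = yes      # subtree when the test succeeds
        self.no = no        # subtree when the test fails

def predict(tree, example):
    """Walk the tree, following the branch chosen by each node's test."""
    while isinstance(tree, Node):
        tree = tree.yes if tree.test(example) else tree.no
    return tree.value

# Toy tree: test "has a nitro group", then "molecular weight > 200".
tree = Node(lambda m: m["nitro"],
            Node(lambda m: m["weight"] > 200, Leaf(3.5), Leaf(1.2)),
            Leaf(0.4))

print(predict(tree, {"nitro": True, "weight": 250}))   # → 3.5
print(predict(tree, {"nitro": False, "weight": 250}))  # → 0.4
```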
Improving the efficiency of inductive logic programming through the use of query packs
 Journal of Artificial Intelligence Research
, 2002
"... Inductive logic programming, or relational learning, is a powerful paradigm for machine learning or data mining. However, in order for ILP to become practically useful, the efficiency of ILP systems must improve substantially. To this end, the notion of a query pack is introduced: it structures sets ..."
Abstract

Cited by 57 (19 self)
Inductive logic programming, or relational learning, is a powerful paradigm for machine learning or data mining. However, in order for ILP to become practically useful, the efficiency of ILP systems must improve substantially. To this end, the notion of a query pack is introduced: it structures sets of similar queries. Furthermore, a mechanism is described for executing such query packs. A complexity analysis shows that considerable efficiency improvements can be achieved through the use of this query pack execution mechanism. This claim is supported by empirical results obtained by incorporating support for query pack execution in two existing learning systems.
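The core idea of a query pack, as the abstract states it, is to structure sets of similar queries so that a shared prefix of literals is evaluated once rather than once per query. A minimal sketch, with literals modelled as plain boolean predicates over an example (the real mechanism executes Prolog queries inside the learning system):

```python
# Query-pack sketch: queries sharing a prefix are merged into a prefix tree,
# so each shared literal is tested once. Predicates are hypothetical.

def heavy(m): return m["weight"] > 200
def nitro(m): return m["nitro"]
def fused(m): return m["rings"] >= 2

def build_pack(queries):
    """Merge queries (tuples of literals) into a prefix tree (a trie)."""
    root = {}
    for qid, query in enumerate(queries):
        node = root
        for lit in query:
            node = node.setdefault(lit, {})
        node[None] = qid              # marks that a complete query ends here
    return root

def run_pack(pack, example):
    """Return the ids of all queries in the pack that succeed on example."""
    hits = []
    for lit, child in pack.items():
        if lit is None:
            hits.append(child)
        elif lit(example):            # a shared prefix is tested only once
            hits.extend(run_pack(child, example))
    return hits

queries = [(heavy, nitro), (heavy, fused), (heavy,)]
pack = build_pack(queries)
print(run_pack(pack, {"weight": 250, "nitro": True, "rings": 1}))  # → [0, 2]
```

All three queries start with `heavy`, so the pack evaluates it a single time; if it fails, every query in the pack fails at once.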
Relational Distance-Based Clustering
, 1998
"... Work on firstorder clustering has primarily been focused on the task of conceptual clustering, i.e., forming clusters with symbolic generalizations in the given representation language. By contrast, for propositional representations, experience has shown that simple algorithms based exclusively on ..."
Abstract

Cited by 32 (0 self)
Work on first-order clustering has primarily been focused on the task of conceptual clustering, i.e., forming clusters with symbolic generalizations in the given representation language. By contrast, for propositional representations, experience has shown that simple algorithms based exclusively on distance measures can often outperform their concept-based counterparts. In this paper, we therefore build on recent advances in the area of first-order distance metrics and present RDBC, a bottom-up agglomerative clustering algorithm for first-order representations that relies on distance information only and features a novel parameter-free pruning measure for selecting the final clustering from the cluster tree. The algorithm can empirically be shown to produce good clusterings (on the mutagenesis domain) that, when used for subsequent prediction tasks, improve on previous clustering results and approach the accuracies of dedicated predictive learners.
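Bottom-up agglomerative clustering driven only by a distance function can be sketched briefly. The single-linkage merge rule and the numeric data are illustrative stand-ins; RDBC itself uses a first-order distance metric over relational examples and a parameter-free pruning step not shown here.

```python
# Minimal distance-only agglomerative clustering (single linkage).

def agglomerate(items, dist, k):
    """Merge the two closest clusters until only k clusters remain."""
    clusters = [[x] for x in items]
    while len(clusters) > k:
        best = None
        for i in range(len(clusters)):
            for j in range(i + 1, len(clusters)):
                d = min(dist(a, b) for a in clusters[i] for b in clusters[j])
                if best is None or d < best[0]:
                    best = (d, i, j)
        _, i, j = best
        clusters[i] += clusters.pop(j)   # merge the closest pair
    return clusters

points = [1.0, 1.1, 5.0, 5.2, 9.0]
print(agglomerate(points, lambda a, b: abs(a - b), 3))
# → [[1.0, 1.1], [5.0, 5.2], [9.0]]
```

Note that only `dist` touches the examples, which is what lets the same skeleton work for propositional vectors or relational objects alike.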
Top-down induction of logical decision trees
 Artificial Intelligence
, 1998
"... Topdown induction of decision trees (TDIDT) is a very popular machine learning technique. Up till now, it has mainly been used for propositional learning, but seldomly for relational learning or inductive logic programming. The main contribution of this paper is the introduction of logical decision ..."
Abstract

Cited by 31 (1 self)
Top-down induction of decision trees (TDIDT) is a very popular machine learning technique. Until now, it has mainly been used for propositional learning, but seldom for relational learning or inductive logic programming. The main contribution of this paper is the introduction of logical decision trees, which make it possible to use TDIDT in inductive logic programming. An implementation of this top-down induction of logical decision trees, the Tilde system, is presented and experimentally evaluated.
Prediction of Ordinal Classes Using Regression Trees
, 2001
"... This paper is devoted to the problem of learning to predict ordinal (i.e., ordered discrete) classes using classification and regression trees. We start with SCART, a tree induction algorithm, and study various ways of transforming it into a learner for ordinal classification tasks. These algorithm ..."
Abstract

Cited by 29 (0 self)
This paper is devoted to the problem of learning to predict ordinal (i.e., ordered discrete) classes using classification and regression trees. We start with SCART, a tree induction algorithm, and study various ways of transforming it into a learner for ordinal classification tasks. These algorithm variants are compared on a number of benchmark data sets to verify the relative strengths and weaknesses of the strategies and to study the trade-off between optimal categorical classification accuracy (hit rate) and minimum distance-based error. Preliminary results indicate that this is a promising avenue towards algorithms that combine aspects of classification and regression.
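The simplest transformation in this family can be sketched directly: encode the ordered classes as consecutive integers, predict a number with any regressor, and round the result back onto the ordinal scale. The tiny two-nearest-neighbour "regressor" and the data below are invented stand-ins for SCART's regression trees.

```python
# Ordinal prediction via regression: regress on class ranks, then round.

classes = ["low", "medium", "high"]              # the ordinal scale
rank = {c: i for i, c in enumerate(classes)}

train = [(1.0, "low"), (2.0, "low"), (4.0, "medium"),
         (6.0, "high"), (7.0, "high")]

def regress(x):
    """Stand-in numeric regressor: mean rank of the two nearest examples."""
    nearest = sorted(train, key=lambda p: abs(p[0] - x))[:2]
    return sum(rank[label] for _, label in nearest) / 2

def predict_ordinal(x):
    y = regress(x)                               # a real-valued prediction
    i = min(max(round(y), 0), len(classes) - 1)  # round and clamp to the scale
    return classes[i]

print(predict_ordinal(1.5))  # → low
print(predict_ordinal(6.5))  # → high
```

The rounding step is where the distance-based error the abstract mentions comes in: a prediction that lands one rank away is a smaller mistake than one that lands two ranks away.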
Lookahead and Discretization in ILP
 In Proceedings of the 7th International Workshop on Inductive Logic Programming
, 1997
"... . We present and evaluate two methods for improving the performance of ILP systems. One of them is discretization of numerical attributes, based on Fayyad and Irani's text [9], but adapted and extended in such a way that it can cope with some aspects of discretization that only occur in relational l ..."
Abstract

Cited by 27 (10 self)
We present and evaluate two methods for improving the performance of ILP systems. One of them is discretization of numerical attributes, based on Fayyad and Irani's text [9], but adapted and extended in such a way that it can cope with some aspects of discretization that only occur in relational learning problems (when indeterminate literals occur). The second technique is lookahead. It is a well-known problem in ILP that a learner cannot always assess the quality of a refinement without knowing which refinements will be enabled afterwards, i.e., without looking ahead in the refinement lattice. We present a simple method for specifying when lookahead is to be used, and what kind of lookahead is interesting. Both the discretization and lookahead techniques are evaluated experimentally. The results show that both techniques improve the quality of the induced theory, while computational costs are acceptable. 1 Introduction Propositional learning has been studied much more extensively th...
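A rough sketch of entropy-based discretization in the spirit of Fayyad and Irani: recursively split a numeric attribute at the cut point that minimizes class entropy. The original MDL stopping criterion is replaced here by a minimum partition size, the data are invented, and the paper's extensions for indeterminate literals are not shown.

```python
# Recursive entropy-minimizing discretization (simplified stopping rule).
from math import log2

def entropy(labels):
    n = len(labels)
    counts = {l: labels.count(l) for l in set(labels)}
    return -sum((c / n) * log2(c / n) for c in counts.values())

def discretize(pairs, min_size=2, cuts=None):
    """Return sorted cut points for a list of (value, class) pairs."""
    if cuts is None:
        cuts = []
    pairs = sorted(pairs)
    n = len(pairs)
    if n < 2 * min_size:
        return cuts
    best = None
    for i in range(min_size, n - min_size + 1):
        left = [c for _, c in pairs[:i]]
        right = [c for _, c in pairs[i:]]
        e = (len(left) * entropy(left) + len(right) * entropy(right)) / n
        if best is None or e < best[0]:
            best = (e, i)
    e, i = best
    if e >= entropy([c for _, c in pairs]):   # no information gain: stop
        return cuts
    cuts.append((pairs[i - 1][0] + pairs[i][0]) / 2)  # midpoint threshold
    discretize(pairs[:i], min_size, cuts)
    discretize(pairs[i:], min_size, cuts)
    return sorted(cuts)

data = [(1, "active"), (2, "active"), (3, "active"),
        (7, "inactive"), (8, "inactive"), (9, "inactive")]
print(discretize(data))  # → [5.0]
```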
Using Logical Decision Trees for Clustering
 In Proceedings of the 7th International Workshop on Inductive Logic Programming
, 1997
"... A novel first order clustering system, called C 0.5, is presented. It inherits its logical decision tree formalism from the TILDE system, but instead of using class information to guide the search, it employs the principles of instance based learning in order to perform clustering. Various experimen ..."
Abstract

Cited by 22 (2 self)
A novel first-order clustering system, called C 0.5, is presented. It inherits its logical decision tree formalism from the TILDE system, but instead of using class information to guide the search, it employs the principles of instance-based learning in order to perform clustering. Various experiments are discussed, which show the promise of the approach. 1 Introduction A decision tree is usually seen as representing a theory for classification of examples. If the examples are positive and negative examples for one specific concept, then the tree defines these two concepts. One could also say, if there are k classes, that the tree defines k concepts. Another viewpoint is taken in Langley's Elements of Machine Learning [Langley, 1996]. Langley sees decision tree induction as a special case of the induction of concept hierarchies. A concept is associated with each node of the tree, and as such the tree represents a kind of taxonomy, a hierarchy of many concepts. This is very similar...
The role of background knowledge: using a problem from chemistry to examine the performance of an ILP program
, 1996
"... Inductive Logic Programming (ILP) systems construct explanations for data in terms of domainspecific background information. How does the quality of this information affect the performance of an ILP system? Results from experiments concerned with learning simple programs for list processing suggest ..."
Abstract

Cited by 20 (2 self)
Inductive Logic Programming (ILP) systems construct explanations for data in terms of domain-specific background information. How does the quality of this information affect the performance of an ILP system? Results from experiments concerned with learning simple programs for list processing suggest that performance is sensitive to the type and amount of background knowledge provided. In particular, background knowledge that contains large amounts of information known to be irrelevant to the problem being considered can, and typically does, hinder an ILP system in its search for a correct explanation.