Top-down induction of clustering trees
In 15th Int’l Conf. on Machine Learning, 1998
Cited by 99 (22 self)
An approach to clustering is presented that adapts the basic top-down induction of decision trees method towards clustering. To this aim, it employs the principles of instance based learning. The resulting methodology is implemented in the TIC (Top-down Induction of Clustering trees) system for first order clustering. The TIC system employs the first order logical decision tree representation of the inductive logic programming system Tilde. Various experiments with TIC are presented, in both propositional and relational domains.
Mining Association Rules in Multiple Relations
In Proceedings of the 7th International Workshop on Inductive Logic Programming, 1997
Cited by 81 (8 self)
The application of algorithms for efficiently generating association rules is so far restricted to cases where information is put together in a single relation. We describe how this restriction can be overcome through the combination of the available algorithms with standard techniques from the field of inductive logic programming. We present the system Warmr, which extends Apriori [2] to mine association rules in multiple relations. We apply Warmr to the natural language processing task of mining part-of-speech tagging rules in a large corpus of English.
Keywords: association rules, inductive logic programming
1 Introduction
Association rules are generally recognized as a highly valuable type of regularities and various algorithms have been presented for efficiently mining them in large databases (cf. [1, 7, 2]). To the best of our knowledge, the application of these algorithms is so far restricted to cases where information is put together in a single relation. We describe how th...
Maximum Entropy Modeling with Clausal Constraints
In Proceedings of the 7th International Workshop on Inductive Logic Programming, 1997
Cited by 37 (1 self)
We present the learning system Maccent which addresses the novel task of stochastic MAximum ENTropy modeling with Clausal Constraints. The maximum entropy method is a Bayesian method based on the principle that the target stochastic model should be as uniform as possible, subject to known constraints. Maccent incorporates clausal constraints that are based on the evaluation of Prolog clauses in examples represented as Prolog programs. We build on an existing maximum-likelihood approach to maximum entropy modeling, which we upgrade along two dimensions: (1) Maccent can handle larger search spaces, due to a partial ordering defined on the space of clausal constraints, and (2) uses a richer first-order logic format. In comparison with other inductive logic programming systems, Maccent seems to be the first that explicitly constructs a conditional probability distribution p(C|I) based on an empirical distribution ~p(C|I) (where p(C|I) (~p(C|I)) gives the induced (observed) probability of ...
Top-down induction of logical decision trees
Artificial Intelligence, 1998
Cited by 31 (1 self)
Top-down induction of decision trees (TDIDT) is a very popular machine learning technique. Until now, it has mainly been used for propositional learning, but seldom for relational learning or inductive logic programming. The main contribution of this paper is the introduction of logical decision trees, which make it possible to use TDIDT in inductive logic programming. An implementation of this top-down induction of logical decision trees, the Tilde system, is presented and experimentally evaluated.
Lookahead and Discretization in ILP
In Proceedings of the 7th International Workshop on Inductive Logic Programming, 1997
Cited by 27 (10 self)
We present and evaluate two methods for improving the performance of ILP systems. One of them is discretization of numerical attributes, based on Fayyad and Irani's text [9], but adapted and extended in such a way that it can cope with some aspects of discretization that only occur in relational learning problems (when indeterminate literals occur). The second technique is lookahead. It is a well-known problem in ILP that a learner cannot always assess the quality of a refinement without knowing which refinements will be enabled afterwards, i.e. without looking ahead in the refinement lattice. We present a simple method for specifying when lookahead is to be used, and what kind of lookahead is interesting. Both the discretization and lookahead techniques are evaluated experimentally. The results show that both techniques improve the quality of the induced theory, while computational costs are acceptable.
1 Introduction
Propositional learning has been studied much more extensively th...
Using Logical Decision Trees for Clustering
In Proceedings of the 7th International Workshop on Inductive Logic Programming, 1997
Cited by 22 (2 self)
A novel first order clustering system, called C0.5, is presented. It inherits its logical decision tree formalism from the TILDE system, but instead of using class information to guide the search, it employs the principles of instance based learning in order to perform clustering. Various experiments are discussed, which show the promise of the approach.
1 Introduction
A decision tree is usually seen as representing a theory for classification of examples. If the examples are positive and negative examples for one specific concept, then the tree defines these two concepts. One could also say, if there are k classes, that the tree defines k concepts. Another viewpoint is taken in Langley's Elements of Machine Learning [Langley, 1996]. Langley sees decision tree induction as a special case of the induction of concept hierarchies. A concept is associated with each node of the tree, and as such the tree represents a kind of taxonomy, a hierarchy of many concepts. This is very similar...
On Multiclass Problems and Discretization in Inductive Logic Programming
1997
Cited by 11 (6 self)
In practical applications of machine learning and knowledge discovery, handling multiclass problems and real numbers are important issues. While attribute-value learners address these problems as a rule, very few ILP systems do so. The few ILP systems that handle real numbers mostly do so by trying out all real values applicable, thus running into efficiency or overfitting problems. The ILP learner ICL (Inductive Constraint Logic) learns first order logic formulae from positive and negative examples. The main characteristic of ICL is its view on examples, which are seen as interpretations which are true or false for the target theory. The paper reports on the extensions of ICL to tackle multiclass problems and real numbers. We also discuss some issues on learning CNF formulae versus DNF formulae related to these extensions. Finally, we present experiments in the practical domains of predicting mutagenesis, finite element mesh design and predicting biodegradability of chemical comp...
Frequent query discovery: a unifying ILP approach to association rule mining
1998
Cited by 8 (1 self)
Discovery of frequent patterns has been studied in a variety of data mining (DM) settings. In its simplest form, known from association rule mining, the task is to find all frequent itemsets, i.e., to list all combinations of items that are found in a sufficient number of examples. A similar task in spirit, but at the opposite end of the complexity scale, is the Inductive Logic Programming (ILP) approach where the goal is to discover queries in first order logic that succeed with respect to a sufficient number of examples. We discuss the relationship of ILP to frequent pattern discovery. On the one hand, our goal is to relate data mining problems to ILP. On the other hand, we want to demonstrate how ILP can be used to solve both existing and new data mining problems. The fundamental task of association rule and frequent set discovery has been extended in various directions, allowing more useful patterns to be discovered. From an ILP viewpoint, however, it can be argued that these settings ar...
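The frequent itemset task mentioned in this abstract (listing every combination of items that occurs in a sufficient number of examples) can be illustrated with a minimal Python sketch. This is a brute-force enumeration for illustration only, not the method of the paper or of Apriori itself; the function name `frequent_itemsets`, the parameter `min_support`, and the toy transactions are all assumptions introduced here.

```python
from itertools import combinations


def frequent_itemsets(transactions, min_support):
    """Return {itemset: count} for every itemset contained in at least
    min_support transactions. Hypothetical helper, brute force only."""
    items = sorted({item for t in transactions for item in t})
    result = {}
    for size in range(1, len(items) + 1):
        found_any = False
        for candidate in combinations(items, size):
            # Count the transactions that contain every item of the candidate.
            count = sum(1 for t in transactions if set(candidate) <= t)
            if count >= min_support:
                result[frozenset(candidate)] = count
                found_any = True
        if not found_any:
            # Every subset of a frequent itemset is itself frequent, so if no
            # itemset of this size is frequent, no larger one can be.
            break
    return result


# Toy data (made up for illustration): each transaction is a set of items.
transactions = [
    {"bread", "milk"},
    {"bread", "butter", "milk"},
    {"butter", "milk"},
    {"bread", "butter"},
]
frequent = frequent_itemsets(transactions, min_support=2)
```

With a minimum support of 2, every single item and every pair is frequent in this toy data, while the full three-item set occurs in only one transaction and is excluded. The first order setting described in the abstract replaces itemsets with queries and subset tests with query success, which this propositional sketch does not attempt to model.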
Dimensionality Reduction in ILP: A Call To Arms
Cited by 6 (1 self)
The recent rise of Knowledge Discovery in Databases (KDD) has underlined the need for machine learning algorithms to be able to tackle large-scale applications that are currently beyond their scope. One way to address this problem is to use techniques for reducing the dimensionality of the learning problem by reducing the hypothesis space and/or reducing the example space. While research in machine learning has devoted considerable attention to such techniques, they have so far been neglected in ILP research. The purpose of this paper is to motivate research in this area and to present some results on windowing techniques.
1 Introduction
One of the most often heard prejudices against ILP algorithms is that they are only applicable to toy problems and will not scale up to applications of significant size. While it is our firm belief that the order of magnitude of this unspecified "significant size" is monotonically increasing in order to keep the argument alive, it is nevertheless indis...
Mining a Natural Language Corpus for Multi-Relational Association Rules
1997
Cited by 4 (2 self)
Association rules are generally recognized as a highly valuable type of regularities and various algorithms have been presented for efficiently mining them in large databases. To the best of our knowledge, the application of these algorithms is so far restricted to databases that consist of a single relation composed of a set of binary attributes. We describe how these restrictions can be overcome through the combination of the available algorithms with standard techniques from the field of inductive logic programming. We present the algorithm AprioriRel, which extends Apriori [Agrawal et al., 1996] to mine association rules in multiple relations. Whereas in Apriori each example is described by means of a single tuple, in AprioriRel each example is viewed as a separate database with a selection, from multiple relations, of all tuples related to the example. Accordingly, the association rules discovered by AprioriRel may combine information from various relations to statements of th...