Results 1-10 of 14
Separate-and-conquer rule learning
Artificial Intelligence Review, 1999
Cited by 135 (29 self)
This paper is a survey of inductive rule learning algorithms that use a separate-and-conquer strategy. This strategy can be traced back to the AQ learning system and still enjoys popularity, as can be seen from its frequent use in inductive logic programming systems. We will put this wide variety of algorithms into a single framework and analyze them along three different dimensions, namely their search, language, and overfitting avoidance biases.
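The separate-and-conquer strategy named in this abstract can be illustrated with a minimal propositional sketch. Everything here is an assumption made for illustration and is not taken from the survey: the example encoding, the function names (`covers`, `learn_one_rule`, `separate_and_conquer`), and the choice of precision as the greedy search heuristic. Real systems differ in exactly the search, language, and overfitting avoidance biases the abstract lists.

```python
from typing import Dict, List, Tuple

Example = Tuple[Dict[str, str], bool]  # (attribute values, is_positive)
Rule = List[Tuple[str, str]]           # conjunction of attribute == value tests


def covers(rule: Rule, attrs: Dict[str, str]) -> bool:
    """A rule covers an example if every test in the conjunction holds."""
    return all(attrs.get(a) == v for a, v in rule)


def learn_one_rule(examples: List[Example]) -> Rule:
    """Conquer step: greedily add tests until no negative example is covered."""
    rule: Rule = []
    covered = list(examples)
    while any(not pos for _, pos in covered):
        # Candidate tests are taken from the positives that are still covered.
        candidates = {(a, v) for attrs, pos in covered if pos
                      for a, v in attrs.items()} - set(rule)
        if not candidates:
            break  # cannot specialize further (e.g. contradictory data)

        def precision(test: Tuple[str, str]) -> Tuple[float, int]:
            hits = [pos for attrs, pos in covered if attrs.get(test[0]) == test[1]]
            return (sum(hits) / len(hits), sum(hits))

        best = max(sorted(candidates), key=precision)
        rule.append(best)
        covered = [(attrs, pos) for attrs, pos in covered if covers(rule, attrs)]
    return rule


def separate_and_conquer(examples: List[Example]) -> List[Rule]:
    """Learn rules one at a time until every positive example is covered."""
    theory: List[Rule] = []
    remaining = list(examples)
    while any(pos for _, pos in remaining):
        rule = learn_one_rule(remaining)
        theory.append(rule)
        # Separate step: drop the examples the new rule covers.
        remaining = [(attrs, pos) for attrs, pos in remaining
                     if not covers(rule, attrs)]
    return theory
```

In this sketch the separate step removes every covered example; some covering algorithms remove only the covered positives, which is a design choice the sketch does not explore.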
Structural Regression Trees
1996
Cited by 64 (10 self)
In many real-world domains the task of machine learning algorithms is to learn a theory predicting numerical values. In particular, several standard test domains used in Inductive Logic Programming (ILP) are concerned with predicting numerical values from examples and relational and mostly non-determinate background knowledge. However, so far no ILP algorithm except one can predict numbers and cope with non-determinate background knowledge. (The only exception is a covering algorithm called FORS.) In this paper we present Structural Regression Trees (SRT), a new algorithm which can be applied to the above class of problems by integrating the statistical method of regression trees into ILP. SRT constructs a tree containing a literal (an atomic formula or its negation) or a conjunction of literals in each node, and assigns a numerical value to each leaf. SRT provides more comprehensible results than purely statistical methods, and can be applied to a class of problems most other ILP syste...
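The idea of a tree with a test in each node and a numerical value in each leaf can be sketched as follows. This is not SRT itself: the abstract does not state the splitting criterion, so the sketch assumes the usual regression-tree criterion of minimizing the summed squared error of the two branches, and it replaces relational literals with plain Python predicates. The names `sse` and `best_split` are hypothetical.

```python
from statistics import mean
from typing import Callable, Dict, List, Optional, Tuple

Example = Tuple[dict, float]          # (description of the example, target value)
Test = Callable[[dict], bool]         # stands in for a literal or conjunction


def sse(values: List[float]) -> float:
    """Sum of squared errors around the mean, the value a leaf would predict."""
    if not values:
        return 0.0
    m = mean(values)
    return sum((v - m) ** 2 for v in values)


def best_split(examples: List[Example], tests: Dict[str, Test]) -> Optional[str]:
    """Return the name of the test that most reduces squared error, if any."""
    best_name: Optional[str] = None
    best_err = sse([y for _, y in examples])
    for name, test in tests.items():
        yes = [y for x, y in examples if test(x)]
        no = [y for x, y in examples if not test(x)]
        if not yes or not no:
            continue  # the test does not separate the examples
        err = sse(yes) + sse(no)
        if err < best_err:
            best_name, best_err = name, err
    return best_name
```

A full tree learner would apply `best_split` recursively to each branch and store the mean target value in every leaf.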
Top-down induction of logical decision trees
Artificial Intelligence, 1998
Cited by 31 (1 self)
Top-down induction of decision trees (TDIDT) is a very popular machine learning technique. Up to now, it has mainly been used for propositional learning, but seldom for relational learning or inductive logic programming. The main contribution of this paper is the introduction of logical decision trees, which make it possible to use TDIDT in inductive logic programming. An implementation of this top-down induction of logical decision trees, the Tilde system, is presented and experimentally evaluated.
Combining Divide-and-Conquer and Separate-and-Conquer for Efficient and Effective Rule Induction
Proc. of the Ninth International Workshop on Inductive Logic Programming, LNAI Series 1634, 1999
Cited by 12 (2 self)
Divide-and-Conquer (DAC) and Separate-and-Conquer (SAC) are two strategies for rule induction that have been used extensively. When searching for rules, DAC is maximally conservative w.r.t. decisions made during search for previous rules. This results in a very efficient strategy, which however suffers from difficulties in effectively inducing disjunctive concepts due to the replication problem. SAC, on the other hand, is maximally liberal in the same respect. This allows for a larger hypothesis space to be searched, which in many cases avoids the replication problem but at the cost of lower efficiency. We present a hybrid strategy called Reconsider-and-Conquer (RAC), which handles the replication problem more effectively than DAC by reconsidering some of the earlier decisions and allows for more efficient induction than SAC by holding on to some of the decisions. We present experimental results from propositional, numerical and relational domains demonstrating that RAC si...
Theory-Guided Induction of Logic Programs by Inference of Regular Languages
Proc. of the 13th International Conference on Machine Learning, 1996
Cited by 10 (2 self)
Previous resolution-based approaches to theory-guided induction of logic programs produce hypotheses in the form of a set of resolvents of a theory, where the resolvents represent allowed sequences of resolution steps for the initial theory. There are, however, many characterizations of allowed sequences of resolution steps that cannot be expressed by a set of resolvents. One approach to this problem is presented, the system MERLIN, which is based on an earlier technique for learning finite-state automata that represent allowed sequences of resolution steps. MERLIN extends the previous technique in three ways: i) negative examples are considered in addition to positive examples, ii) a new strategy for performing generalization is used, and iii) a technique for converting the learned automaton to a logic program is included. Results from experiments are presented in which MERLIN outperforms both a system using the old strategy for performing generalization, and a t...
Induction of Logic Programs by Example-guided Unfolding
Journal of Logic Programming, 1999
Cited by 7 (2 self)
Resolution has been used as a specialisation operator in several approaches to top-down induction of logic programs. This operator allows the overly general hypothesis to be used as a declarative bias that restricts not only what predicate symbols can be used in produced hypotheses, but also how the predicates can be invoked. The two main strategies for top-down induction of logic programs, Covering and Divide-and-Conquer, are formalised using resolution as a specialisation operator, resulting in two strategies for performing example-guided unfolding. These strategies are compared both theoretically and experimentally. It is shown that the computational cost grows quadratically in the size of the example set for Covering, while it grows linearly for Divide-and-Conquer. This is also demonstrated by experiments, in which the amount of work performed by Covering is up to 30 times the amount of work performed by Divide-and-Conquer. The theoretical analysis shows that the hypothesis space is larger for Covering, and thus more compact hypotheses may be found by this technique than by Divide-and-Conquer. However, it is shown that for each non-recursive hypothesis that can be produced by Covering, there is an equivalent hypothesis (w.r.t. the background predicates) that can be produced by Divide-and-Conquer. A major drawback of Divide-and-Conquer, in contrast to Covering, is that it is not applicable to learning recursive definitions.
Induction in first order logic from noisy training examples and fixed example set size
PhD thesis, 1999
Cited by 6 (0 self)
This dissertation investigates the field of inductive logic programming (ILP) and in so doing an ILP system, Lime, is designed and developed. Lime addresses the problem of noisy training examples; learning from only positive, only negative, or both positive and negative examples; efficiently biasing and searching the hypothesis space; and handling recursion efficiently and effectively. The Q-heuristic is introduced to address the problem of learning with both noisy training examples and fixed numbers of positive and negative training examples. This heuristic is based on Bayes' rule. Both a justification of its derivation and a description of the context in which it is appropriately applied are given. Because of the general nature of this heuristic, its application is not restricted to ILP. Instead of employing a greedy covering approach to constructing clauses, Lime employs the Q-heuristic to evaluate entire logic programs as hypotheses. To tame the inevitable explosion in the search space, the notion of a simple clause is introduced. These sets of literals may be viewed as subparts of clauses that are effectively independent in terms of variables used. Instead of growing a clause one literal at a time, Lime efficiently combines simple clauses to construct a set of gainful candidate clauses. Subsets of these candidate clauses are evaluated using the Q-heuristic to find the final hypothesis. Details of the algorithms and data structures of Lime are discussed. Lime's handling of recursive logic programs is also described. Experimental results are provided to illustrate how Lime achieves its design goals of better noise handling, learning from a fixed set of examples (e.g., from only positive data), and of learning recursive logic programs. These results compare the performance of Lime with other leading ILP systems like Foil and Progol in a variety of domains. Empirical results with a boosted version of Lime are also reported.
Hyper-rectangle-based discriminative data generalization and applications in data mining
2007
Cited by 5 (2 self)
The ultimate goal of data mining is to extract knowledge from massive data. Knowledge is ideally represented as human-comprehensible patterns from which end-users can gain intuitions and insights. Axis-parallel hyper-rectangles provide interpretable generalizations for multi-dimensional data points with numerical attributes. In this dissertation, we study the fundamental problem of rectangle-based discriminative data generalization in the context of several useful data mining applications: cluster description, rule learning, and Nearest Rectangle classification. Clustering is one of the most important data mining tasks. However, most clustering methods output sets of points as clusters and do not generalize them into interpretable patterns. We perform a systematic study of cluster description, where we propose novel description formats leading to enhanced expressive power and introduce novel description problems specifying different tradeoffs between interpretability and accuracy. We also present efficient heuristic algorithms for the introduced problems in the proposed formats. If-then rules are
Evaluation Measures for Multi-class Subgroup Discovery
Cited by 4 (4 self)
www.cs.bristol.ac.uk/~dawood/ www.cs.bristol.ac.uk/~flach/
Subgroup discovery aims at finding subsets of a population whose class distribution is significantly different from the overall distribution. It has previously predominantly been investigated in a two-class context. This paper investigates multi-class subgroup discovery methods. We consider six evaluation measures for multi-class subgroups, four of them new, and study their theoretical properties. We extend the two-class subgroup discovery algorithm CN2-SD to incorporate the new evaluation measures and a new weighting scheme inspired by AdaBoost. We demonstrate the usefulness of multi-class subgroup discovery experimentally, using discovered subgroups as features for a decision tree learner. Not only is the number of leaves of the decision tree reduced by a factor between 8 and 16 on average, but significant improvements in accuracy and AUC are achieved with particular evaluation measures and settings. Similar performance improvements can be observed when using naive Bayes.
Anytime Inductive Logic Programming
In Proceedings of the 15th International Conference on Computers and Their Applications, 2000
Cited by 2 (0 self)
Anytime algorithms are algorithms that can "always" produce a result. Often the result of the algorithm depends on the time at hand: the longer the time, the better the answer. In this paper we present an easy way of turning regular Inductive Logic Programming (ILP) algorithms such as Divide-And-Conquer (DAC) and Separate-And-Conquer (SAC) into anytime algorithms. We conduct experiments with these anytime algorithms and introduce a simple heuristic called squared quota, which we compare with an established one, information gain. It seems that squared quota is better suited for a small window size of example data, and hence better to use in anytime systems. A comparison between SAC and DAC reveals that they excel in different combinations of examples/background knowledge.
1 Introduction
The area of artificial intelligence (AI) that studies learning is called machine learning. According to [4] there are four main approaches to machine learning: decision trees, neural networks, ge...
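As a reminder of what the established heuristic computes, the sketch below gives the common entropy-based definition of information gain for a boolean test over positive and negative examples. It is an assumption that this matches the variant used in the paper; the abstract does not say. Squared quota is not defined in the abstract, so it is not sketched here. The function names are hypothetical.

```python
import math


def entropy(pos: int, neg: int) -> float:
    """Binary entropy, in bits, of a set with pos positive and neg negative examples."""
    total = pos + neg
    if total == 0:
        return 0.0
    h = 0.0
    for count in (pos, neg):
        if count:
            p = count / total
            h -= p * math.log2(p)
    return h


def information_gain(pos: int, neg: int, pos_true: int, neg_true: int) -> float:
    """Entropy reduction obtained by splitting on a boolean test.

    pos_true and neg_true count the positive and negative examples
    for which the test holds; the rest fall into the other branch.
    """
    total = pos + neg
    if total == 0:
        return 0.0
    n_true = pos_true + neg_true
    n_false = total - n_true
    remainder = (
        (n_true / total) * entropy(pos_true, neg_true)
        + (n_false / total) * entropy(pos - pos_true, neg - neg_true)
    )
    return entropy(pos, neg) - remainder
```

A test that separates the classes perfectly gets the maximum gain, while a test that leaves the class proportions unchanged in both branches gets a gain of zero.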