Results 1-10 of 105
Learning Stochastic Logic Programs
, 2000
"... Stochastic Logic Programs (SLPs) have been shown to be a generalisation of Hidden Markov Models (HMMs), stochastic contextfree grammars, and directed Bayes' nets. A stochastic logic program consists of a set of labelled clauses p:C where p is in the interval [0,1] and C is a firstorder range ..."
Abstract

Cited by 1057 (71 self)
Stochastic Logic Programs (SLPs) have been shown to be a generalisation of Hidden Markov Models (HMMs), stochastic context-free grammars, and directed Bayes' nets. A stochastic logic program consists of a set of labelled clauses p:C where p is in the interval [0,1] and C is a first-order range-restricted definite clause. This paper summarises the syntax, distributional semantics and proof techniques for SLPs and then discusses how a standard Inductive Logic Programming (ILP) system, Progol, has been modified to support learning of SLPs. The resulting system 1) finds an SLP with uniform probability labels on each definition and near-maximal Bayes posterior probability and then 2) alters the probability labels to further increase the posterior probability. Stage 1) is implemented within CProgol4.5, which differs from previous versions of Progol by allowing user-defined evaluation functions written in Prolog. It is shown that maximising the Bayesian posterior function involves finding SLPs with short derivations of the examples. Search pruning with the Bayesian evaluation function is carried out in the same way as in previous versions of CProgol. The system is demonstrated with worked examples involving the learning of probability distributions over sequences as well as the learning of simple forms of uncertain knowledge.
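To make the labelled-clause format p:C concrete, the sketch below samples sequences from a toy SLP that is equivalent to a small stochastic context-free grammar. The clause set and encoding are invented for illustration; real SLPs label first-order range-restricted definite clauses.

```python
import random

# Toy SLP written as probability-labelled clauses (invented example):
#   0.4 : s([a|T]) :- s(T)
#   0.6 : s([])
# The labels on each predicate's clauses sum to 1, so the program
# defines a probability distribution over the sequences it derives.
SLP = {"s": [(0.4, ["a", "s"]),
             (0.6, [])]}

def sample(head, rng):
    """Sample one derivation, choosing among clauses by their labels."""
    clauses = SLP[head]
    _, body = rng.choices(clauses, weights=[p for p, _ in clauses])[0]
    out = []
    for atom in body:
        out.extend(sample(atom, rng) if atom in SLP else [atom])
    return out

rng = random.Random(0)
seqs = [tuple(sample("s", rng)) for _ in range(10000)]
frac_empty = sum(s == () for s in seqs) / len(seqs)
print(frac_empty)  # should be close to the 0.6 label on the s([]) clause
```

Sampling a derivation step by step in this way is exactly how an SLP's clause labels induce a distribution over proofs, and hence over the sequences the proofs derive.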
Learning logical definitions from relations
 MACHINE LEARNING
, 1990
"... Abstract. This paper describes FOIL, a system that learns Horn clauses from data expressed as relations. FOIL is based on ideas that have proved effective in attributevalue learning systems, but extends them to a firstorder formalism. This new system has been applied successfully to several tasks ..."
Abstract

Cited by 856 (8 self)
This paper describes FOIL, a system that learns Horn clauses from data expressed as relations. FOIL is based on ideas that have proved effective in attribute-value learning systems, but extends them to a first-order formalism. This new system has been applied successfully to several tasks taken from the machine learning literature.
FOIL: A Midterm Report
 In Proceedings of the European Conference on Machine Learning
, 1993
"... : FOIL is a learning system that constructs Horn clause programs from examples. This paper summarises the development of FOIL from 1989 up to early 1993 and evaluates its effectiveness on a nontrivial sequence of learning tasks taken from a Prolog programming text. Although many of these tasks ..."
Abstract

Cited by 212 (3 self)
FOIL is a learning system that constructs Horn clause programs from examples. This paper summarises the development of FOIL from 1989 up to early 1993 and evaluates its effectiveness on a non-trivial sequence of learning tasks taken from a Prolog programming text. Although many of these tasks are handled reasonably well, the experiment highlights some weaknesses of the current implementation. Areas for further research are identified.

1. Introduction
The principal differences between zeroth-order and first-order supervised learning systems are the form of the training data and the way that a learned theory is expressed. Data for zeroth-order learning programs such as ASSISTANT [Cestnik, Kononenko and Bratko, 1986], CART [Breiman, Friedman, Olshen and Stone, 1984], CN2 [Clark and Niblett, 1987] and C4.5 [Quinlan, 1992] comprise pre-classified cases, each described by its values for a fixed collection of attributes. These systems develop theories, in the form of decision trees o...
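FOIL grows each clause greedily, scoring candidate literals with its information-based gain measure. The sketch below implements that gain formula; the example counts are invented for illustration.

```python
import math

def foil_gain(p0, n0, p1, n1, t):
    """FOIL's information-based gain for specialising a clause with a
    new literal: p0/n0 are the positive/negative tuples covered before
    the literal is added, p1/n1 after, and t is the number of positive
    tuples still covered by the specialised clause."""
    info_before = -math.log2(p0 / (p0 + n0))  # bits to signal a positive
    info_after = -math.log2(p1 / (p1 + n1))
    return t * (info_before - info_after)

# Invented counts: a literal that keeps 6 of 8 positives while
# eliminating 10 of 12 negatives earns a solidly positive gain.
gain = foil_gain(p0=8, n0=12, p1=6, n1=2, t=6)
print(gain)
```

A literal that discards negatives while retaining positives raises the purity term and so scores high; one that discards positives lowers t and is penalised.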
Interpreting Bayesian Logic Programs
 PROCEEDINGS OF THE WORK-IN-PROGRESS TRACK AT THE 10TH INTERNATIONAL CONFERENCE ON INDUCTIVE LOGIC PROGRAMMING
, 2001
"... Various proposals for combining first order logic with Bayesian nets exist. We introduce the formalism of Bayesian logic programs, which is basically a simplification and reformulation of Ngo and Haddawys probabilistic logic programs. However, Bayesian logic programs are sufficiently powerful to ..."
Abstract

Cited by 109 (7 self)
Various proposals for combining first-order logic with Bayesian nets exist. We introduce the formalism of Bayesian logic programs, which is basically a simplification and reformulation of Ngo and Haddawy's probabilistic logic programs. However, Bayesian logic programs are sufficiently powerful to represent essentially the same knowledge in a more elegant manner. The elegance is illustrated by the fact that they can represent both Bayesian nets and definite clause programs (as in "pure" Prolog) and that their kernel in Prolog is actually an adaptation of a standard Prolog meta-interpreter.
Automated Refinement of First-Order Horn-Clause Domain Theories
 MACHINE LEARNING
, 1995
"... Knowledge acquisition is a difficult, errorprone, and timeconsuming task. The task of automatically improving an existing knowledge base using learning methods is addressed by the class of systems performing theory refinement. This paper presents a system, Forte (FirstOrder Revision of Theories f ..."
Abstract

Cited by 81 (7 self)
Knowledge acquisition is a difficult, error-prone, and time-consuming task. The task of automatically improving an existing knowledge base using learning methods is addressed by the class of systems performing theory refinement. This paper presents a system, Forte (First-Order Revision of Theories from Examples), which refines first-order Horn-clause theories by integrating a variety of different revision techniques into a coherent whole. Forte uses these techniques within a hill-climbing framework, guided by a global heuristic. It identifies possible errors in the theory and calls on a library of operators to develop possible revisions. The best revision is implemented, and the process repeats until no further revisions are possible. Operators are drawn from a variety of sources, including propositional theory refinement, first-order induction, and inverse resolution. Forte is demonstrated in several domains, including logic programming and qualitative modelling.
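The revise-until-no-improvement loop described in the abstract can be sketched generically. This is a schematic hill climb, not Forte's actual operator library; the toy "theory", operators, and score function are invented stand-ins.

```python
def hill_climb_revise(theory, operators, score):
    """Schematic greedy theory revision: each operator proposes
    candidate revisions of the current theory; adopt any revision that
    improves the score, and repeat until no operator can improve it."""
    current, best = theory, score(theory)
    improved = True
    while improved:
        improved = False
        for op in operators:
            for candidate in op(current):
                s = score(candidate)
                if s > best:
                    current, best, improved = candidate, s, True
    return current

# Toy stand-ins: the "theory" is an integer, two revision operators
# nudge it up or down, and the score function peaks at 7.
operators = [lambda t: [t + 1], lambda t: [t - 1]]
revised = hill_climb_revise(0, operators, lambda t: -(t - 7) ** 2)
print(revised)  # climbs from 0 to the score maximum at 7
```

In a real refinement system the operators would add or delete clause antecedents, invent predicates, and so on, and the score would be accuracy on the training examples.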
Scaling up inductive logic programming by learning from interpretations
 Data Mining and Knowledge Discovery
, 1999
"... Abstract. When comparing inductive logic programming (ILP) and attributevalue learning techniques, there is a tradeoff between expressive power and efficiency. Inductive logic programming techniques are typically more expressive but also less efficient. Therefore, the data sets handled by current ..."
Abstract

Cited by 41 (14 self)
When comparing inductive logic programming (ILP) and attribute-value learning techniques, there is a trade-off between expressive power and efficiency. Inductive logic programming techniques are typically more expressive but also less efficient. Therefore, the data sets handled by current inductive logic programming systems are small according to general standards within the data mining community. The main source of inefficiency lies in the assumption that several examples may be related to each other, so they cannot be handled independently. Within the learning from interpretations framework for inductive logic programming this assumption is unnecessary, which makes it possible to scale up existing ILP algorithms. In this paper we explain this learning setting in the context of relational databases. We relate the setting to propositional data mining and to the classical ILP setting, and show that learning from interpretations corresponds to learning from multiple relations and thus extends the expressiveness of propositional learning, while maintaining its efficiency to a large extent (which is not the case in the classical ILP setting). As a case study, we present two alternative implementations of the ILP system Tilde (Top-down Induction of Logical DEcision trees): Tildeclassic, which loads all data into main memory, and TildeLDS, which loads the examples one by one. We experimentally compare the implementations, showing that TildeLDS can handle large data sets (on the order of 100,000 examples or 100 MB) and indeed scales up linearly in the number of examples.
Maximum Entropy Modeling with Clausal Constraints
 In Proceedings of the 7th International Workshop on Inductive Logic Programming
, 1997
"... We present the learning system Maccent which addresses the novel task of stochastic MAximum ENTropy modeling with Clausal Constraints. Maximum Entropy method is a Bayesian method based on the principle that the target stochastic model should be as uniform as possible, subject to known constraints. ..."
Abstract

Cited by 37 (1 self)
We present the learning system Maccent which addresses the novel task of stochastic MAximum ENTropy modeling with Clausal Constraints. The maximum entropy method is a Bayesian method based on the principle that the target stochastic model should be as uniform as possible, subject to known constraints. Maccent incorporates clausal constraints that are based on the evaluation of Prolog clauses in examples represented as Prolog programs. We build on an existing maximum-likelihood approach to maximum entropy modeling, which we upgrade along two dimensions: (1) Maccent can handle larger search spaces, due to a partial ordering defined on the space of clausal constraints, and (2) it uses a richer first-order logic format. In comparison with other inductive logic programming systems, Maccent seems to be the first that explicitly constructs a conditional probability distribution p(C|I) based on an empirical distribution ~p(C|I) (where p(C|I) (~p(C|I)) gives the induced (observed) probability of ...
Discovery of Relational Association Rules
 Relational data mining
, 2000
"... Within KDD, the discovery of frequent patterns has been studied in a variety of settings. In its simplest form, known from association rule mining, the task is to discover all frequent item sets, i.e., all combinations of items that are found in a sufficient number of examples. ..."
Abstract

Cited by 34 (1 self)
Within KDD, the discovery of frequent patterns has been studied in a variety of settings. In its simplest form, known from association rule mining, the task is to discover all frequent item sets, i.e., all combinations of items that are found in a sufficient number of examples.
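The simplest form of the task, frequent item set discovery, admits a compact level-wise sketch: keep the size-k sets that occur in at least min_support examples, then build size-(k+1) candidates from the survivors. The function name and the toy transactions below are invented for illustration.

```python
def frequent_itemsets(transactions, min_support):
    """Level-wise (Apriori-style) discovery of all frequent item sets:
    an item set is frequent if it is contained in at least min_support
    transactions; candidates of size k+1 are built only from frequent
    sets of size k, since no superset of an infrequent set is frequent."""
    items = sorted({i for t in transactions for i in t})
    frequent = {}
    level, k = [frozenset([i]) for i in items], 1
    while level:
        counts = {c: sum(c <= t for t in transactions) for c in level}
        survivors = {c: n for c, n in counts.items() if n >= min_support}
        frequent.update(survivors)
        level = {a | b for a in survivors for b in survivors
                 if len(a | b) == k + 1}
        k += 1
    return frequent  # maps each frequent item set to its support count

# Invented toy transactions over three items.
data = [frozenset(t) for t in
        ({"a", "b", "c"}, {"a", "b"}, {"a", "c"}, {"b", "c"}, {"a", "b", "c"})]
found = frequent_itemsets(data, min_support=3)
print(sorted(sorted(s) for s in found))
```

Here every single item and every pair occurs in at least three of the five transactions, while the triple {a, b, c} occurs in only two and is pruned.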
Top-down induction of logical decision trees
 Artificial Intelligence
, 1998
"... Topdown induction of decision trees (TDIDT) is a very popular machine learning technique. Up till now, it has mainly been used for propositional learning, but seldomly for relational learning or inductive logic programming. The main contribution of this paper is the introduction of logical decision ..."
Abstract

Cited by 31 (1 self)
Top-down induction of decision trees (TDIDT) is a very popular machine learning technique. Until now, it has mainly been used for propositional learning, but seldom for relational learning or inductive logic programming. The main contribution of this paper is the introduction of logical decision trees, which make it possible to use TDIDT in inductive logic programming. An implementation of this top-down induction of logical decision trees, the Tilde system, is presented and experimentally evaluated.
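The TDIDT scheme can be sketched with node tests that are arbitrary predicates, standing in for the Prolog queries a logical decision tree would use. Everything below (the split criterion, the molecule data, the test names) is an invented illustration, not Tilde's actual algorithm.

```python
def majority(examples):
    """Most common class label among (object, label) pairs."""
    labels = [lbl for _, lbl in examples]
    return max(set(labels), key=labels.count)

def errors(examples):
    """Examples misclassified by predicting the majority label."""
    if not examples:
        return 0
    labels = [lbl for _, lbl in examples]
    return len(labels) - labels.count(majority(examples))

def induce_tree(examples, tests):
    """TDIDT with logical tests: choose the test whose yes/no split
    leaves the fewest misclassified examples, then recurse on both
    branches until a branch is pure or no tests remain."""
    if len({lbl for _, lbl in examples}) == 1 or not tests:
        return majority(examples)
    name, pred = min(tests, key=lambda t:
                     errors([e for e in examples if t[1](e[0])]) +
                     errors([e for e in examples if not t[1](e[0])]))
    yes = [e for e in examples if pred(e[0])]
    no = [e for e in examples if not pred(e[0])]
    if not yes or not no:
        return majority(examples)
    rest = [t for t in tests if t[0] != name]
    return (name, induce_tree(yes, rest), induce_tree(no, rest))

# Invented molecules: active iff they have a ring and are heavy.
data = [({"ring": True, "wt": 120}, "active"),
        ({"ring": True, "wt": 80}, "inactive"),
        ({"ring": False, "wt": 150}, "inactive"),
        ({"ring": False, "wt": 60}, "inactive")]
tests = [("has_ring", lambda m: m["ring"]),
         ("heavy", lambda m: m["wt"] > 100)]
tree = induce_tree(data, tests)
print(tree)
```

The induced tree nests the "heavy" test inside the "has_ring" branch, mirroring how a logical decision tree refines a query along each path.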
Hierarchical Model-Based Diagnosis
 International Journal of Man-Machine Studies
, 1991
"... Modelbased reasoning about a system requires an explicit representation of the system's components and their connections. Diagnosing such a system consists of locating those components whose abnormal behavior accounts for the faulty system behavior. In order to increase the efficiency of modelbase ..."
Abstract

Cited by 31 (2 self)
Model-based reasoning about a system requires an explicit representation of the system's components and their connections. Diagnosing such a system consists of locating those components whose abnormal behavior accounts for the faulty system behavior. In order to increase the efficiency of model-based diagnosis, we propose a model representation at several levels of detail, and define three refinement (abstraction) operators. We specify formal conditions that have to be satisfied by the hierarchical representation, and emphasize that the multi-level scheme is independent of any particular single-level model representation. The hierarchical diagnostic algorithm which we define turns out to be very general. We show that it emulates the bisection method, and can be used for hierarchical constraint satisfaction. We apply the hierarchical modeling principle and diagnostic algorithm to a medium-scale medical problem. The performance of a four-level qualitative model of the heart is compared t...
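The bisection behaviour the abstract mentions can be illustrated on the simplest hierarchy: a chain of components repeatedly split in half. This sketch assumes a single faulty component and a hypothetical oracle that tells us whether a span of components misbehaves; the function and component names are invented.

```python
def diagnose(components, misbehaves):
    """Hierarchical diagnosis of a chain of components by repeated
    halving: test whether the aggregated left half accounts for the
    fault, then refine into that half or into the other one, until a
    single component is isolated (assumes exactly one fault)."""
    lo, hi = 0, len(components)
    while hi - lo > 1:
        mid = (lo + hi) // 2
        if misbehaves(components[lo:mid]):
            hi = mid   # fault localised to the left submodel
        else:
            lo = mid   # otherwise it must lie in the right half
    return components[lo]

# Invented component chain; the oracle flags spans containing the fault.
parts = ["valve", "pump", "sensor", "controller", "relay"]
print(diagnose(parts, lambda span: "controller" in span))
```

Each oracle call corresponds to checking one abstract-level submodel against observations, so the fault is isolated with logarithmically many checks instead of testing every component.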