Results 1-10 of 12
Inverse entailment and Progol
, 1995
Cited by 663 (59 self)
This paper firstly provides a reappraisal of the development of techniques for inverting deduction, secondly introduces Mode-Directed Inverse Entailment (MDIE) as a generalisation and enhancement of previous approaches and thirdly describes an implementation of MDIE in the Progol system. Progol is implemented in C and available by anonymous ftp. The reassessment of previous techniques in terms of inverse entailment leads to new results for learning from positive data and inverting implication between pairs of clauses.
Feature construction with Inductive Logic Programming: a study of quantitative predictions of chemical activity aided by structural attributes
 Data Mining and Knowledge Discovery
, 1996
Cited by 66 (9 self)
Recently, computer programs developed within the field of Inductive Logic Programming have received some attention for their ability to construct restricted first-order logic solutions using problem-specific background knowledge. Prominent applications of such programs have been concerned with determining "structure-activity" relationships in the areas of molecular biology and chemistry. Typically the task here is to predict the "activity" of a compound, like toxicity, from its chemical structure.
Inductive Logic Programming: derivations, successes and shortcomings
 SIGART Bulletin
, 1993
Cited by 31 (3 self)
Inductive Logic Programming (ILP) is a research area which investigates the construction of first-order definite clause theories from examples and background knowledge. ILP systems have been applied successfully in a number of real-world domains. These include the learning of structure-activity rules for drug design, finite-element mesh design rules, rules for primary-secondary prediction of protein structure and fault diagnosis rules for satellites. There is a well-established tradition of learning-in-the-limit results in ILP. Recently some results within Valiant's PAC-learning framework have also been demonstrated for ILP systems. In this paper it is argued that algorithms can be directly derived from the formal specifications of ILP. This provides a common basis for Inverse Resolution, Explanation-Based Learning, Abduction and Relative Least General Generalisation. A new general-purpose, efficient approach to predicate invention is demonstrated. ILP is underconstrained by its logical ...
Inverting Implication
 Artificial Intelligence Journal
, 1992
Cited by 26 (2 self)
All generalisations within logic involve inverting implication. Yet, ever since Plotkin's work in the early 1970s, methods of generalising first-order clauses have involved inverting the clausal subsumption relationship. However, even Plotkin realised that this approach was incomplete. Since inversion of subsumption is central to many Inductive Logic Programming approaches, this form of incompleteness has been propagated to techniques such as Inverse Resolution and Relative Least General Generalisation. A more complete approach to inverting implication has been attempted with some success recently by Lapointe and Matwin. In the present paper the author derives general solutions to this problem from first principles. It is shown that clausal subsumption is only incomplete for self-recursive clauses. Avoiding this incompleteness involves algorithms which find "nth roots" of clauses. Completeness and correctness results are proved for a non-deterministic algorithm which constructs nth ro...
A study of two sampling methods for analysing large datasets with ILP
, 1999
Cited by 24 (5 self)
This paper is concerned with problems that arise when submitting large quantities of data to analysis by an Inductive Logic Programming (ILP) system. Complexity arguments usually make it prohibitive to analyse such datasets in their entirety. We examine two schemes that allow an ILP system to construct theories by sampling from this large pool of data. The first, "subsampling", is a single-sample design in which the utility of a potential rule is evaluated on a randomly selected subsample of the data. The second, "logical windowing", is a multiple-sample design that tests and sequentially includes errors made by a partially correct theory. Both schemes are derived from techniques developed to enable propositional learning methods (like decision trees) to cope with large datasets. The ILP system CProgol, equipped with each of these methods, is used to construct theories for two datasets: one artificial (a chess endgame) and the other naturally occurring (a language tagging problem). I...
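The two sampling schemes in this abstract can be illustrated with a small sketch. It is not CProgol's implementation: the one-feature threshold learner, the integer examples and every function name below are hypothetical stand-ins, chosen only to show the difference between scoring a rule on one random subsample and growing a window with the errors of the current theory.

```python
import random
from typing import Callable, List, Sequence, Tuple

Example = Tuple[int, bool]   # (feature value, class label); toy stand-in for ground facts
Rule = Callable[[int], bool]  # a candidate rule predicts a label from the feature


def accuracy(rule: Rule, data: Sequence[Example]) -> float:
    """Fraction of examples whose label the rule predicts correctly."""
    return sum(rule(x) == y for x, y in data) / len(data)


def subsample_utility(rule: Rule, data: Sequence[Example], k: int,
                      rng: random.Random) -> float:
    """Single-sample design: score the rule on k randomly selected examples."""
    return accuracy(rule, rng.sample(list(data), k))


def logical_windowing(learn: Callable[[List[Example]], Rule],
                      data: Sequence[Example], initial_size: int,
                      rng: random.Random, max_rounds: int = 10):
    """Multiple-sample design: repeatedly add the current theory's errors to the window."""
    pool = list(data)
    rng.shuffle(pool)
    window, rest = pool[:initial_size], pool[initial_size:]
    theory = learn(window)
    for _ in range(max_rounds):
        errors = [(x, y) for x, y in rest if theory(x) != y]
        if not errors:
            break  # the theory is consistent with all remaining data
        window += errors
        rest = [(x, y) for x, y in rest if theory(x) == y]
        theory = learn(window)
    return theory, window


def learn_threshold(window: List[Example]) -> Rule:
    """Hypothetical learner: smallest threshold t with the best accuracy on the window."""
    best_t = max(range(101), key=lambda t: accuracy(lambda x: x >= t, window))
    return lambda x: x >= best_t


# Toy dataset: the (hidden) target concept is x >= 37.
data = [(x, x >= 37) for x in range(100)]
rng = random.Random(0)
theory, window = logical_windowing(learn_threshold, data, 10, rng)
estimate = subsample_utility(lambda x: x >= 37, data, 20, rng)
```

On this toy problem the window ends up much smaller than the full dataset, which is the point of the design: the learner only ever sees the initial sample plus the examples it got wrong.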
A study of two probabilistic methods for searching large spaces with ILP
, 1999
Cited by 22 (3 self)
Given sample data and background knowledge encoded in the form of logic programs, a predictive Inductive Logic Programming (ILP) system attempts to find a set of rules (or clauses) for predicting classification labels in the data. Most present-day systems for this purpose rely on some variant of a generate-and-test procedure that repeatedly examines a set of potential candidates (termed here as the "search space") and selects one or more clauses according to some criterion of "goodness". The worst-case time-complexity of such systems depends critically on: (1) the size of the search space; and (2) the cost of estimating the goodness of a clause. This paper is concerned with addressing the first issue and is motivated by two principal factors. First, the representation adopted by an ILP system often engenders a search space whose size dominates complexity calculations. Straightforward arguments show that examining fewer clauses should lead to faster execution times. Second,...
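The cost argument in this abstract (number of candidates examined times the cost of scoring each one) can be made concrete with a short sketch. The abstract is truncated before it names the paper's two probabilistic methods, so the uniform random sampling used here is only an illustrative assumption, and the integer "clauses", the `goodness` function and all function names are hypothetical.

```python
import random
from typing import Callable, Sequence, Tuple


def generate_and_test(candidates: Sequence[int],
                      goodness: Callable[[int], float]) -> Tuple[int, float, int]:
    """Score every candidate; return (best candidate, its score, evaluations made)."""
    scored = [(goodness(c), c) for c in candidates]
    best_score, best = max(scored, key=lambda pair: pair[0])
    return best, best_score, len(scored)


def sampled_generate_and_test(candidates: Sequence[int],
                              goodness: Callable[[int], float],
                              n: int, rng: random.Random) -> Tuple[int, float, int]:
    """Examine only n randomly drawn candidates (an assumed strategy, see lead-in)."""
    return generate_and_test(rng.sample(list(candidates), n), goodness)


# Toy search space: 1000 integer "clauses"; the best one is 640.
space = list(range(1000))
goodness = lambda c: -abs(c - 640)

full = generate_and_test(space, goodness)
partial = sampled_generate_and_test(space, goodness, 50, random.Random(0))
```

The sampled search performs 50 goodness evaluations instead of 1000, at the price of possibly returning a clause that is worse than the true optimum.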
Numerical reasoning with an ILP system capable of lazy evaluation and customised search
 Journal of Logic Programming
, 1999
Cited by 16 (6 self)
Using problem-specific background knowledge, computer programs developed within the framework of Inductive Logic Programming (ILP) have been used to construct restricted first-order logic solutions to scientific problems. However, their approach to the analysis of data with substantial numerical content has been largely limited to constructing clauses that: (a) provide qualitative descriptions ("high", "low" etc.) of the values of response variables; and (b) contain simple inequalities restricting the ranges of predictor variables. This has precluded the application of such techniques to scientific and engineering problems requiring a more sophisticated approach. A number of specialised methods have been suggested to remedy this. In contrast, we have chosen to take advantage of the fact that the existing theoretical framework for ILP places very few restrictions on the nature of the background knowledge. We describe two issues of implementation that make it possible to us...
Extracting context-sensitive models in Inductive Logic Programming
 Machine Learning
, 2001
Cited by 13 (1 self)
Given domain-specific background knowledge and data in the form of examples, an Inductive Logic Programming (ILP) system extracts models in the data-analytic sense. We view the model-selection step facing an ILP system as a decision problem, the solution of which requires knowledge of the context in which the model is to be deployed. In this paper, "context" will be defined by the current specification of the prior class distribution and the client's preferences concerning errors of classification. Within this restricted setting, we consider the use of an ILP system in situations where: (a) contexts can change regularly. This can arise, for example, from changes to class distributions or misclassification costs; and (b) the data are from observational studies. That is, they may not have been collected with any particular context in mind. Some repercussions of these are: (a) any one model may not be the optimal choice for all contexts; and (b) not all the background information provided may be relevant for all contexts. Using results from the analysis of Receiver Operating Characteristic curves, we investigate a technique that can equip an ILP system to reject those models that cannot possibly be optimal in any context. We present empirical results from using the technique to analyse two datasets concerned with the toxicity of chemicals (in particular, their mutagenic and carcinogenic properties). Clients can, and typically do, approach such datasets with quite different requirements. For example, a synthetic chemist would require models with a low rate of commission errors which could be used to direct efficiently the synthesis of new compounds. A toxicologist, on the other hand, would prefer models with a low rate of omission errors. This would enable a more complete identificati...
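One standard result from ROC analysis that fits the description above is the convex-hull criterion: a model whose (false-positive rate, true-positive rate) point lies strictly below the upper convex hull of all models cannot minimise expected cost for any class distribution and cost assignment. The abstract does not say that this is the exact technique used, so the sketch below is an assumption, and the model names and rates are invented for illustration.

```python
from typing import Dict, List, Set, Tuple

Point = Tuple[float, float]  # (false-positive rate, true-positive rate)


def cross(o: Point, a: Point, b: Point) -> float:
    """Cross product of vectors o->a and o->b; negative means a clockwise turn."""
    return (a[0] - o[0]) * (b[1] - o[1]) - (a[1] - o[1]) * (b[0] - o[0])


def roc_upper_hull(points: List[Point]) -> List[Point]:
    """Upper convex hull of the points plus the trivial classifiers (0,0) and (1,1)."""
    pts = sorted(set(points) | {(0.0, 0.0), (1.0, 1.0)})
    hull: List[Point] = []
    for p in pts:
        # Drop the last hull point while it lies on or below the line to p.
        while len(hull) >= 2 and cross(hull[-2], hull[-1], p) >= 0:
            hull.pop()
        hull.append(p)
    return hull


def potentially_optimal(models: Dict[str, Point]) -> Set[str]:
    """Names of models on the hull; all others can be rejected for every context."""
    hull = set(roc_upper_hull(list(models.values())))
    return {name for name, point in models.items() if point in hull}


# Hypothetical models with made-up error rates.
models = {"m1": (0.1, 0.6), "m2": (0.3, 0.5), "m3": (0.4, 0.9)}
kept = potentially_optimal(models)
```

Here `m2` is dominated: it has both more false positives and fewer true positives than `m1`, so it falls below the hull and is discarded, while `m1` and `m3` each remain the best choice for some range of contexts.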
An empirical study of the use of relevance information in Inductive Logic Programming
Cited by 7 (0 self)
Inductive Logic Programming (ILP) systems construct models for data using domain-specific background information. When using these systems, it is typically assumed that sufficient human expertise is at hand to rule out irrelevant background information. Such irrelevant information can, and typically does, hinder an ILP system's search for good models. Here, we provide evidence that if additional expertise is available that can provide a partial ordering on sets of background predicates in terms of relevance to the analysis task, then this can be used to good effect by an ILP system. In particular, using data from biochemical domains, we investigate an incremental strategy of including sets of predicates in decreasing order of relevance. Results obtained suggest that: (a) the incremental approach identifies, in less time, a model that is comparable in predictive accuracy to that obtained with all background information in place; and (b) the incremental approach using the relevance ordering performs better than one that does not (that is, one that adds sets of predicates randomly). For a practitioner concerned with use of ILP, the implications of these findings are twofold: (1) when not all background information can be used at once (either due to limitations of the ILP system, or the nature of the domain) expert assessment of the relevance of background predicates can assist substantially in the construction of good models; and (2) good "first-cut" results can be obtained quickly by a simple exclusion of information known to be less relevant.
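The incremental strategy described in this abstract can be sketched as a simple loop over predicate sets ranked by relevance. The abstract does not give a stopping rule, so the `good_enough` threshold is an assumption, and the `fit` callback, the predicate names and the toy scoring function are hypothetical placeholders for a real ILP run.

```python
from typing import Callable, FrozenSet, List, Sequence, Set, Tuple


def incremental_search(ranked_groups: Sequence[Sequence[str]],
                       fit: Callable[[FrozenSet[str]], float],
                       good_enough: float) -> Tuple[Set[str], List[Tuple[int, float]]]:
    """Add predicate sets in decreasing order of relevance until the model is good enough.

    Returns the background predicates in use and a (size, score) history per step.
    """
    background: Set[str] = set()
    history: List[Tuple[int, float]] = []
    for group in ranked_groups:
        background |= set(group)
        score = fit(frozenset(background))  # stand-in for running the ILP system
        history.append((len(background), score))
        if score >= good_enough:  # assumed stopping rule, not taken from the abstract
            break
    return background, history


# Hypothetical relevance ordering supplied by a domain expert, most relevant first.
ranked = [["atom", "bond"], ["ring"], ["logp"]]
useful = {"bond", "ring"}
toy_fit = lambda bg: float(len(bg & useful))  # toy score: count of useful predicates

background, history = incremental_search(ranked, toy_fit, good_enough=2.0)
```

In this toy run the loop stops after the second group, so the least relevant set (`logp`) is never handed to the learner.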
A note on two simple transformations for improving the efficiency of an ILP system
, 2000
Cited by 5 (3 self)
Inductive Logic Programming (ILP) systems have had noteworthy successes in extracting comprehensible and accurate models for data drawn from a number of scientific and engineering domains. These results suggest that ILP methods could enhance the model-construction capabilities of software tools being developed for the emerging discipline of "knowledge discovery from databases." One significant concern in the use of ILP for this purpose is that of efficiency. The performance of modern ILP systems is principally affected by two issues: (1) they often have to search through very large numbers of possible rules (usually in the form of definite clauses); (2) they have to score each rule on the data (usually in the form of ground facts) to estimate "goodness". Stochastic and greedy approaches have been proposed to alleviate the complexity arising from each of these issues. While these techniques can result in order-of-magnitude improvements in the worst-case search complexity of an ...