Learning Stochastic Logic Programs
, 2000
Cited by 962 (56 self)
Stochastic Logic Programs (SLPs) have been shown to be a generalisation of Hidden Markov Models (HMMs), stochastic context-free grammars, and directed Bayes' nets. A stochastic logic program consists of a set of labelled clauses p:C where p is in the interval [0,1] and C is a first-order range-restricted definite clause. This paper summarises the syntax, distributional semantics and proof techniques for SLPs and then discusses how a standard Inductive Logic Programming (ILP) system, Progol, has been modified to support learning of SLPs. The resulting system 1) finds an SLP with uniform probability labels on each definition and near-maximal Bayes posterior probability and then 2) alters the probability labels to further increase the posterior probability. Stage 1) is implemented within CProgol4.5, which differs from previous versions of Progol by allowing user-defined evaluation functions written in Prolog. It is shown that maximising the Bayesian posterior function involves finding SLPs with short derivations of the examples. Search pruning with the Bayesian evaluation function is carried out in the same way as in previous versions of CProgol. The system is demonstrated with worked examples involving the learning of probability distributions over sequences as well as the learning of simple forms of uncertain knowledge.
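To make the labelled-clause notation p:C concrete, here is a minimal Python sketch of a toy SLP defining a distribution over sequences. The three clauses, their labels, and the function names are hypothetical illustrations, not taken from the paper, and the sketch assumes that the labels on the clauses of one predicate sum to 1. Under that assumption, the probability of a sequence is the product of the labels of the clauses used in its derivation.

```python
from fractions import Fraction

# Hypothetical toy SLP over sequences of the symbols a and b (not from the paper):
#   1/2 : s([a|T]) :- s(T).
#   1/4 : s([b|T]) :- s(T).
#   1/4 : s([]).
CLAUSE_LABELS = {
    "a": Fraction(1, 2),     # clause that emits 'a' and recurses
    "b": Fraction(1, 4),     # clause that emits 'b' and recurses
    "stop": Fraction(1, 4),  # base clause that ends the sequence
}


def labels_are_normalised(labels):
    """Check the assumed condition that the labels of one predicate sum to 1."""
    return sum(labels.values()) == 1


def sequence_probability(seq):
    """Multiply the labels of the clauses used to derive s(seq)."""
    p = Fraction(1)
    for symbol in seq:
        p *= CLAUSE_LABELS[symbol]
    return p * CLAUSE_LABELS["stop"]


if __name__ == "__main__":
    print(labels_are_normalised(CLAUSE_LABELS))
    print(sequence_probability("ab"))
```

Because every label is at most 1, each extra derivation step can only lower the product, which gives some intuition for why short derivations of the examples score well.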
Bottom-Up Relational Learning of Pattern Matching Rules for Information Extraction
, 2003
Cited by 277 (16 self)
Information extraction is a form of shallow text processing that locates a specified set of relevant items in a natural-language document. Systems for this task require significant domain-specific knowledge and are time-consuming and difficult to build by hand, making them a good application for machine learning. We present an algorithm, RAPIER, that uses pairs of sample documents and filled templates to induce pattern-match rules that directly extract fillers for the slots in the template. RAPIER is a bottom-up learning algorithm that incorporates techniques from several inductive logic programming systems. We have implemented the algorithm in a system that allows patterns to have constraints on the words, part-of-speech tags, and semantic classes present in the filler and the surrounding text. We present encouraging experimental results on two domains.
Clausal Discovery
- Machine Learning
, 1996
Cited by 170 (32 self)
The clausal discovery engine Claudien is presented. Claudien is an inductive logic programming engine that fits in the knowledge discovery in databases and data mining paradigm as it discovers regularities that are valid in data. As such Claudien performs a novel induction task, which is called characteristic induction from closed observations, and which is related to existing formalizations of induction in logic. In characterising induction from closed observations, the regularities are represented by clausal theories, and the data using Herbrand interpretations. Claudien also employs a novel declarative bias mechanism to define the set of clauses that may appear in a hypothesis. Keywords : Inductive Logic Programming, Knowledge Discovery in Databases, Data Mining, Learning, Induction, Semantics for Induction, Logic of Induction, Parallel Learning. 1 Introduction Despite the fact that the areas of knowledge discovery in databases [Fayyad et al., 1995] and inductive logic programmin...
Learning Trees and Rules with Set-valued Features
, 1996
Cited by 163 (2 self)
In most learning systems examples are represented as fixed-length "feature vectors", the components of which are either real numbers or nominal values. We propose an extension of the feature-vector representation that allows the value of a feature to be a set of strings; for instance, to represent a small white and black dog with the nominal features size and species and the set-valued feature color, one might use a feature vector with size=small, species=canis-familiaris and color={white,black}. Since we make no assumptions about the number of possible set elements, this extension of the traditional feature-vector representation is closely connected to Blum's "infinite attribute" representation. We argue that many decision tree and rule learning algorithms can be easily extended to set-valued features. We also show by example that many real-world learning problems can be efficiently and naturally represented with set-valued features; in particular, text categorization problems and probl...
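The dog example from the abstract can be written down directly. The following Python sketch is an illustration only: the dictionary layout and the helper `contains_test` are assumptions, and the membership test ("does the set contain this string?") is one plausible kind of condition a tree or rule learner could use on a set-valued feature, not necessarily the one the paper defines.

```python
# The example from the abstract: two nominal features and one set-valued feature.
dog = {
    "size": "small",
    "species": "canis-familiaris",
    "color": frozenset({"white", "black"}),  # set-valued feature
}


def contains_test(feature, element):
    """Build a boolean test: is `element` a member of the set-valued `feature`?

    Hypothetical helper; a learner could use such tests as split conditions.
    """
    def test(example):
        return element in example[feature]
    return test


has_white = contains_test("color", "white")
has_brown = contains_test("color", "brown")

if __name__ == "__main__":
    print(has_white(dog), has_brown(dog))
```

No vocabulary of possible colors is declared up front, which matches the abstract's point that no assumption is made about the number of possible set elements.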
Theories for Mutagenicity: A Study in First-Order and Feature-Based Induction
- Artificial Intelligence
, 1996
Cited by 141 (29 self)
A classic problem from chemistry is used to test a conjecture that in domains for which data are most naturally represented by graphs, theories constructed with Inductive Logic Programming (ILP) will significantly outperform those using simpler feature-based methods. One area that has long been associated with graph-based or structural representation and reasoning is organic chemistry. In this field, we consider the problem of predicting the mutagenic activity of small molecules: a property that is related to carcinogenicity, and an important consideration in developing less hazardous drugs. By providing an ILP system with progressively more structural information concerning the molecules, we compare the predictive power of the logical theories constructed against benchmarks set by regression, neural, and tree-based methods. 1 Introduction Constructing theories to explain observations occupies much of the creative hours of scientists and engineers. Programs from the field of Inductiv...
Separate-and-conquer rule learning
- Artificial Intelligence Review
, 1999
Cited by 118 (29 self)
This paper is a survey of inductive rule learning algorithms that use a separate-and-conquer strategy. This strategy can be traced back to the AQ learning system and still enjoys popularity as can be seen from its frequent use in inductive logic programming systems. We will put this wide variety of algorithms into a single framework and analyze them along three different dimensions, namely their search, language and overfitting avoidance biases.
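As a rough illustration of the separate-and-conquer strategy, the Python sketch below learns one rule that covers some positive examples and no negatives, removes the positives it covers (separate), and repeats on the rest (conquer). Everything specific here is an assumption made for the sketch: rules are conjunctions of attribute=value conditions, conditions are chosen greedily by precision, and the function names and toy data are hypothetical. The survey itself covers many variations along its search, language, and overfitting-avoidance dimensions.

```python
def covers(rule, example):
    """A rule is a dict of attribute=value conditions that must all hold."""
    return all(example.get(attr) == val for attr, val in rule.items())


def learn_one_rule(pos, neg):
    """Greedily add the most precise condition until no negative is covered."""
    rule = {}
    cov_pos, cov_neg = list(pos), list(neg)
    while cov_neg:
        # Candidate conditions come from the positives still covered.
        candidates = {(a, v) for ex in cov_pos for a, v in ex.items()
                      if a not in rule}
        if not candidates:
            break  # cannot specialise further

        def precision(cond):
            a, v = cond
            p = sum(1 for ex in cov_pos if ex.get(a) == v)
            n = sum(1 for ex in cov_neg if ex.get(a) == v)
            return (p / (p + n), p)  # ties broken by positive coverage

        a, v = max(sorted(candidates), key=precision)
        rule[a] = v
        cov_pos = [ex for ex in cov_pos if ex.get(a) == v]
        cov_neg = [ex for ex in cov_neg if ex.get(a) == v]
    return rule


def separate_and_conquer(pos, neg):
    """Learn rules one at a time, removing the positives each rule covers."""
    rules, remaining = [], list(pos)
    while remaining:
        rule = learn_one_rule(remaining, neg)
        if any(covers(rule, ex) for ex in neg):
            break  # no consistent rule found for the remaining positives
        rules.append(rule)
        remaining = [ex for ex in remaining if not covers(rule, ex)]
    return rules


# Hypothetical toy data.
POS = [{"outlook": "sunny", "windy": "no"},
       {"outlook": "overcast", "windy": "no"},
       {"outlook": "overcast", "windy": "yes"}]
NEG = [{"outlook": "sunny", "windy": "yes"},
       {"outlook": "rainy", "windy": "yes"}]

if __name__ == "__main__":
    print(separate_and_conquer(POS, NEG))
```

This sketch does no pruning and has no stopping criterion beyond consistency, so it illustrates only the covering loop, not the overfitting-avoidance biases the survey analyzes.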
An Algorithm for Multi-Relational Discovery of Subgroups
, 1997
Cited by 105 (8 self)
We consider the problem of finding statistically unusual subgroups in a multi-relation database, and extend previous work on single-relation subgroup discovery. We give a precise definition of the multi-relation subgroup discovery task, propose a specific form of declarative bias based on foreign links as a means of specifying the hypothesis space, and show how propositional evaluation functions can be adapted to the multi-relation setting. We then describe an algorithm for this problem setting that uses optimistic estimate and minimal support pruning, an optimal refinement operator and sampling to ensure efficiency and can easily be parallelized.
Relational Learning Techniques for Natural Language Information Extraction
, 1998
Cited by 73 (4 self)
The recent growth of online information available in the form of natural language documents creates a greater need for computing systems with the ability to process those documents to simplify access to the information. One type of processing appropriate for many tasks is information extraction, a type of text skimming that retrieves specific types of information from text. Although information extraction systems have existed for two decades, these systems have generally been built by hand and contain domain specific information, making them difficult to port to other domains. A few researchers have begun to apply machine learning to information extraction tasks, but most of this work has involved applying learning to pieces of a much larger system. This paper presents a novel rule representation specific to natural language and a learning system, Rapier, which learns information extraction rules. Rapier takes pairs of documents and filled templates indicating the information to be ext...
An efficient algorithm for discovering frequent subgraphs
- IEEE Transactions on Knowledge and Data Engineering
, 2002
Cited by 68 (5 self)
Over the years, frequent itemset discovery algorithms have been used to find interesting patterns in various application areas. However, as data mining techniques are being increasingly applied to non-traditional domains, existing frequent pattern discovery approaches cannot be used. This is because the transaction framework that is assumed by these algorithms cannot be used to effectively model the datasets in these domains. An alternate way of modeling the objects in these datasets is to represent them using graphs. Within that model, one way of formulating the frequent pattern discovery problem is as that of discovering subgraphs that occur frequently over the entire set of graphs. In this paper we present a computationally efficient algorithm, called FSG, for finding all frequent subgraphs in large graph datasets. We experimentally evaluate the performance of FSG using a variety of real and synthetic datasets. Our results show that despite the underlying complexity associated with frequent subgraph discovery, FSG is effective in finding all frequently occurring subgraphs in datasets containing over 200,000 graph transactions and scales linearly with respect to the size of the dataset. Index Terms: Data mining, scientific datasets, frequent pattern discovery, chemical compound datasets.
Relational Instance-Based Learning
- Proceedings of the Thirteenth International Conference on Machine Learning
, 1996
Cited by 65 (1 self)
A relational instance-based learning algorithm, called Ribl, is motivated and developed in this paper. We argue that instance-based methods offer solutions to the often unsatisfactory behavior of current inductive logic programming (ILP) approaches in domains with continuous attribute values and in domains with noisy attributes and/or examples. Three research issues that emerge when a propositional instance-based learner is adapted to a first-order representation are identified: (1) construction of cases from the knowledge base, (2) computation of similarity between arbitrarily complex cases, and (3) estimation of the relevance of predicates and attributes. Solutions to these issues are developed. Empirical results indicate that Ribl is able to achieve high classification accuracy in a variety of domains. To appear in: Proc. 13th International Conference on Machine Learning, L. Saitta (ed.), Morgan Kaufmann, 1996 1 Introduction The field of Inductive Logic Programming ...

