Results 1 - 10 of 19
Top-Down Induction of Clustering Trees
- In Proceedings of the 15th International Conference on Machine Learning
, 1998
Cited by 83 (21 self)
An approach to clustering is presented that adapts the basic top-down induction of decision trees method towards clustering. To this aim, it employs the principles of instance based learning. The resulting methodology is implemented in the TIC (Top down Induction of Clustering trees) system for first order clustering. The TIC system employs the first order logical decision tree representation of the inductive logic programming system Tilde. Various experiments with TIC are presented, in both propositional and relational domains.
Distance Between Herbrand Interpretations: a measure for approximations to a target concept
, 1997
Cited by 32 (0 self)
We can use a metric to measure the differences between elements in a domain or subsets of that domain (i.e. concepts). Which particular metric should be chosen depends on the kind of difference we want to measure. The well known Euclidean metric on ℝ^n and its generalizations are often used for this purpose, but such metrics are not always suitable for concepts where elements have some structure different from real numbers. For example, in (Inductive) Logic Programming a concept is often expressed as an Herbrand interpretation of some first-order language. Every element in an Herbrand interpretation is a ground atom which has a tree structure. We start by defining a metric d on the set of expressions (ground atoms and ground terms), motivated by the structure and complexity of the expressions and the symbols used therein. This metric induces the Hausdorff metric h on the set of all sets of ground atoms, which allows us to measure the distance between Herbrand interpretatio...
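The construction in the abstract above (a metric d on ground expressions inducing a Hausdorff metric h on sets of ground atoms) can be illustrated with a minimal Python sketch. Everything concrete here is an assumption made for illustration: the tuple encoding of terms, the particular bounded distance in `term_distance`, and the convention used for empty sets are not taken from the abstract, which does not give the paper's definition. Only the Hausdorff formula itself is standard.

```python
from typing import Iterable, Tuple, Union

# Hypothetical encoding: a ground term or atom is either a constant (str)
# or a tuple (symbol, arg1, ..., argn).
Term = Union[str, Tuple]


def term_distance(s: Term, t: Term) -> float:
    """Illustrative bounded distance on ground expressions (an assumption).

    Identical expressions are at distance 0, expressions with different
    top symbols or arities are at distance 1, and otherwise the argument
    distances are averaged and halved, so deeper differences count less.
    """
    if s == t:
        return 0.0
    if isinstance(s, str) or isinstance(t, str):
        return 1.0
    if s[0] != t[0] or len(s) != len(t):
        return 1.0
    n = len(s) - 1  # n >= 1 here, since equal zero-argument terms returned above
    return sum(term_distance(a, b) for a, b in zip(s[1:], t[1:])) / (2 * n)


def hausdorff(A: Iterable[Term], B: Iterable[Term], d=term_distance) -> float:
    """Hausdorff distance between two finite sets under the metric d."""
    A, B = list(A), list(B)
    if not A and not B:
        return 0.0
    if not A or not B:
        return 1.0  # assumed convention: maximal distance to the empty set

    def directed(X, Y):
        # Largest distance from a point of X to its nearest point in Y.
        return max(min(d(x, y) for y in Y) for x in X)

    return max(directed(A, B), directed(B, A))


# Two small interpretations: {p(a), q(f(a))} and {p(a), q(f(b))}.
I1 = [("p", "a"), ("q", ("f", "a"))]
I2 = [("p", "a"), ("q", ("f", "b"))]
print(hausdorff(I1, I2))  # 0.25
```

The two interpretations differ only in a constant nested two levels deep, so under this illustrative distance the mismatch is halved twice before the Hausdorff step takes the worst nearest-neighbour distance.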
Relational Distance-Based Clustering
, 1998
Cited by 28 (0 self)
Work on first-order clustering has primarily been focused on the task of conceptual clustering, i.e., forming clusters with symbolic generalizations in the given representation language. By contrast, for propositional representations, experience has shown that simple algorithms based exclusively on distance measures can often outperform their concept-based counterparts. In this paper, we therefore build on recent advances in the area of first-order distance metrics and present RDBC, a bottom-up agglomerative clustering algorithm for first-order representations that relies on distance information only and features a novel parameter-free pruning measure for selecting the final clustering from the cluster tree. The algorithm can empirically be shown to produce good clusterings (on the mutagenesis domain) that, when used for subsequent prediction tasks, improve on previous clustering results and approach the accuracies of dedicated predictive learners.
A Framework for Defining Distances Between First-Order Logic Objects
, 1998
Cited by 26 (3 self)
In this paper we develop a framework for distances between clauses and distances between models. The framework can be parametrised by a measure for the distance between atoms. It takes into account subterms common to distinct atoms of a set of atoms in the measurement of the distance between sets. Moreover, for a constant number of variables, the complexity of the distance computation is polynomially bounded by the size of the objects. Initial experiments show that the framework can be the basis of good clustering algorithms. The framework consists of three levels: At the first level one chooses a distance between atoms. The second level upgrades this distance to a distance between sets of atoms. We propose a framework that is a generalisation of three polynomial time computable similarity measures proposed by Eiter and Mannila, and an instance which is a real distance function, computable in polynomial time. We also develop a binary prototype function for sets of points. Prototype fun
Using Logical Decision Trees for Clustering
- In Proceedings of the 7th International Workshop on Inductive Logic Programming
, 1997
Cited by 22 (2 self)
A novel first order clustering system, called C0.5, is presented. It inherits its logical decision tree formalism from the TILDE system, but instead of using class information to guide the search, it employs the principles of instance based learning in order to perform clustering. Various experiments are discussed, which show the promise of the approach. 1 Introduction A decision tree is usually seen as representing a theory for classification of examples. If the examples are positive and negative examples for one specific concept, then the tree defines these two concepts. One could also say, if there are k classes, that the tree defines k concepts. Another viewpoint is taken in Langley's Elements of Machine Learning [Langley, 1996]. Langley sees decision tree induction as a special case of the induction of concept hierarchies. A concept is associated with each node of the tree, and as such the tree represents a kind of taxonomy, a hierarchy of many concepts. This is very similar...
Hierarchical Multi-Classification
, 2002
Cited by 17 (5 self)
The problem of hierarchical multi-classification is considered.
Distance Measures Between Atoms
- In Proceedings of the CompulogNet Area Meeting on 'Computational Logic and Machine Learning'
, 1998
Cited by 14 (2 self)
Many learning systems, e.g. systems based on clustering and instance based learning systems, need a measure for the distance between objects. Adequate measures are available for attribute value learners. In recent years there has been a growing interest in first order learners; however, existing proposals for distances between non-ground atoms have some drawbacks. In this paper we develop a new measure for the distance between non-ground atoms. 1 Introduction In learning systems based on clustering (e.g. C0.5 [3], KBG [1]) and in instance based learning (e.g. [9, ch. 4], RIBL [6]), a measure of the distance between objects is an essential component. Good measures exist for distances between objects in an attribute value representation (see e.g. [9, ch. 4]). Recently there has been a growing interest in using more expressive first order representations of objects and in upgrading propositional learning systems into first order learning systems (e.g. TILDE [2], ICL [5] and CLAUDIEN [4]). Some ad-hoc s...
Analogical Prediction
, 1999
Cited by 7 (0 self)
Inductive Logic Programming (ILP) involves constructing an hypothesis H on the basis of background knowledge B and training examples E. An independent test set is used to evaluate the accuracy of H. This paper concerns an alternative approach called Analogical Prediction (AP). AP takes B, E and then for each test example ⟨x, y⟩ forms an hypothesis H_x from B, E, x. Evaluation of AP is based on estimating the probability that H_x(x) = y for a randomly chosen ⟨x, y⟩. AP has been implemented within CProgol4.4. Experiments in the paper show that on English past tense data AP has significantly higher predictive accuracy than both previously reported results and CProgol in inductive mode. However, on KRK illegal AP does not outperform CProgol in inductive mode. We conjecture that AP has advantages for domains in which a large proportion of the examples must be treated as exceptions with respect to the hypothesis vocabulary. The relationship of AP to analogy and instance-based lear...
Instance Based Function Learning
- In Proceedings of the Ninth International Workshop on Inductive Logic Programming, Lecture Notes in Artificial Intelligence
, 1999
Cited by 3 (1 self)
The principles of instance based function learning are presented. In IBFL one is given a set of positive examples of a functional predicate. These examples are true ground facts that illustrate the input output behaviour of the predicate. The purpose is then to predict the output of the predicate given a new input. Further assumptions are that there is no background theory and that the inputs and outputs of the predicate consist of structured terms. IBFL is a novel technique that addresses this problem and that combines ideas from instance based learning, first order distances and analogical or case based reasoning. We also argue that IBFL is especially useful when there is a need for handling complex and deeply nested terms. Though we present the technique in isolation, it might be more useful as a component of a larger system to deal e.g. with the logic, language and learning challenge.
Metric-Based Inductive Learning Using Semantic Height Functions
, 2000
Cited by 3 (1 self)
In the present paper we propose a consistent way to integrate syntactical least general generalizations (lgg's) with semantic evaluation of the hypotheses. For this purpose we use two different relations on the hypothesis space -- a constructive one, used to generate lgg's and a semantic one giving the coverage-based evaluation of the lgg. These two relations jointly implement a semantic distance measure. The formal background for this is a height-based definition of a semi-distance in a join semi-lattice. We use some basic results from lattice theory and introduce a family of language independent coverage-based height functions.

