Results 1 – 10 of 81
Discovery of frequent Datalog patterns
, 1999
Abstract

Cited by 128 (9 self)
Discovery of frequent patterns has been studied in a variety of data mining settings. In its simplest form, known from association rule mining, the task is to discover all frequent itemsets, i.e., all combinations of items that are found in a sufficient number of examples. The fundamental task of association rule and frequent set discovery has been extended in various directions, allowing more useful patterns to be discovered with special purpose algorithms. We present Warmr, a general purpose inductive logic programming algorithm that addresses frequent query discovery: a very general Datalog formulation of the frequent pattern discovery problem.
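The frequent itemset task described in this abstract can be illustrated with a naive level-wise search; this is a toy sketch of the generic problem (market-basket data, illustrative names), not Warmr's Datalog formulation:

```python
from itertools import combinations

def frequent_itemsets(transactions, min_support):
    """Naive level-wise search for all itemsets contained in at least
    min_support transactions (toy illustration of the generic task)."""
    items = sorted({i for t in transactions for i in t})
    frequent = {}
    for size in range(1, len(items) + 1):
        found_any = False
        for candidate in combinations(items, size):
            support = sum(1 for t in transactions if set(candidate) <= t)
            if support >= min_support:
                frequent[candidate] = support
                found_any = True
        if not found_any:
            # Anti-monotonicity: if no itemset of this size is frequent,
            # no larger itemset can be frequent either.
            break
    return frequent

baskets = [{"bread", "milk"}, {"bread", "butter"}, {"bread", "milk", "butter"}]
fs = frequent_itemsets(baskets, min_support=2)
```

The specialized algorithms mentioned in the abstract prune the candidate space far more aggressively; this sketch only shows what "frequent" means.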
Relational Reinforcement Learning
, 2001
Abstract

Cited by 103 (6 self)
Relational reinforcement learning is presented, a learning technique that combines reinforcement learning with relational learning or inductive logic programming. Due to the use of a more expressive representation language to represent states, actions and Q-functions, relational reinforcement learning can potentially be applied to a new range of learning tasks. One such task that we investigate is planning in the blocks world, where it is assumed that the effects of the actions are unknown to the agent and the agent has to learn a policy. Within this simple domain we show that relational reinforcement learning solves some existing problems with reinforcement learning: it can abstract from the specific goals pursued and can exploit the results of previous learning phases when addressing new (more complex) situations.
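For background, the propositional Q-learning that RRL generalizes can be sketched on a toy chain world; RRL replaces the explicit Q-table with a relational (first-order) regression model over structured states. This sketch is illustrative, not the paper's algorithm:

```python
import random

def q_learning(n_states, actions, step, episodes=200, alpha=0.5, gamma=0.9, eps=0.1):
    """Tabular Q-learning on a small MDP. RRL replaces this explicit
    table with a relational regression model over structured states."""
    Q = {(s, a): 0.0 for s in range(n_states) for a in actions}
    for _ in range(episodes):
        s = 0
        while s is not None:  # step() returns state None when the episode ends
            if random.random() < eps:
                a = random.choice(actions)                 # explore
            else:
                a = max(actions, key=lambda b: Q[(s, b)])  # exploit
            s2, r = step(s, a)
            future = max(Q[(s2, b)] for b in actions) if s2 is not None else 0.0
            Q[(s, a)] += alpha * (r + gamma * future - Q[(s, a)])
            s = s2
    return Q

# Toy 4-state chain: action +1 moves right; reaching the last state pays 1.
def step(s, a):
    s2 = min(max(s + a, 0), 3)
    return (None, 1.0) if s2 == 3 else (s2, 0.0)

random.seed(0)
Q = q_learning(4, [-1, 1], step)
```

After training, the greedy policy moves right from every state, since discounted value propagates back from the rewarding terminal state.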
Fast discriminative visual codebooks using randomized clustering forests
 In NIPS
, 2007
Abstract

Cited by 101 (4 self)
Some of the most effective recent methods for content-based image classification work by extracting dense or sparse local image descriptors, quantizing them according to a coding rule such as k-means vector quantization, accumulating histograms of the resulting “visual word” codes over the image, and classifying these with a conventional classifier such as an SVM. Large numbers of descriptors and large codebooks are needed for good results, and this becomes slow using k-means. We introduce Extremely Randomized Clustering Forests – ensembles of randomly created clustering trees – and show that these provide more accurate results, much faster training and testing, and good resistance to background clutter in several state-of-the-art image classification tasks.
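The quantization idea can be sketched with a single randomized clustering tree; a real forest combines several such trees, and all names here are illustrative rather than the authors' implementation:

```python
import random

def build_tree(descriptors, depth, dim, rng):
    """Grow one randomized clustering tree: each internal node picks a
    random dimension and a random threshold; leaves act as the
    'visual word' codes."""
    if depth == 0 or len(descriptors) <= 1:
        return {"leaf": True}
    d = rng.randrange(dim)
    lo = min(x[d] for x in descriptors)
    hi = max(x[d] for x in descriptors)
    t = rng.uniform(lo, hi)
    left = [x for x in descriptors if x[d] < t]
    right = [x for x in descriptors if x[d] >= t]
    if not left or not right:
        return {"leaf": True}
    return {"leaf": False, "d": d, "t": t,
            "left": build_tree(left, depth - 1, dim, rng),
            "right": build_tree(right, depth - 1, dim, rng)}

def code(tree, x, path=""):
    """Map a descriptor to its leaf id (the visual word)."""
    if tree["leaf"]:
        return path
    if x[tree["d"]] < tree["t"]:
        return code(tree["left"], x, path + "L")
    return code(tree["right"], x, path + "R")

rng = random.Random(0)
descs = [[rng.random() for _ in range(8)] for _ in range(100)]
tree = build_tree(descs, depth=4, dim=8, rng=rng)
hist = {}
for x in descs:
    w = code(tree, x)
    hist[w] = hist.get(w, 0) + 1
```

Coding a descriptor costs only one comparison per tree level, which is why such trees are much faster than nearest-centroid lookup against a large k-means codebook.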
Collaborative Filtering: A Machine Learning Perspective
, 2004
Abstract

Cited by 59 (3 self)
Collaborative filtering was initially proposed as a framework for filtering information based on the preferences of users, and has since been refined in many different ways. This thesis is a comprehensive study of rating-based, pure, non-sequential collaborative filtering. We analyze existing methods for the task of rating prediction from a machine learning perspective. We show that many existing methods proposed for this task are simple applications or modifications of one or more standard machine learning methods for classification, regression, clustering, dimensionality reduction, and density estimation. We introduce new prediction methods in all of these classes. We introduce a new experimental procedure for testing stronger forms of generalization than have been used previously. We implement a total of nine prediction methods, and conduct large-scale prediction accuracy experiments. We show interesting new results on the relative performance of these methods.
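As a minimal example of the kind of standard method such a study analyzes, a bias-style baseline predicts a rating as the global mean plus user and item offsets. This is a common textbook baseline, not a method claimed by the thesis:

```python
def predict(ratings, user, item):
    """Baseline rating predictor: global mean + user offset + item offset.
    ratings is a dict {(user, item): rating}. Illustrative baseline only."""
    vals = list(ratings.values())
    mu = sum(vals) / len(vals)
    u_r = [r for (u, i), r in ratings.items() if u == user]
    i_r = [r for (u, i), r in ratings.items() if i == item]
    b_u = sum(u_r) / len(u_r) - mu if u_r else 0.0  # user's tendency vs. mean
    b_i = sum(i_r) / len(i_r) - mu if i_r else 0.0  # item's tendency vs. mean
    return mu + b_u + b_i

R = {("ann", "m1"): 5, ("ann", "m2"): 3, ("bob", "m1"): 4, ("bob", "m3"): 2}
```

Unseen users or items simply fall back to the remaining terms, which is one reason such baselines are hard to beat in sparse rating matrices.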
Improving the efficiency of inductive logic programming through the use of query packs
 Journal of Artificial Intelligence Research
, 2002
Abstract

Cited by 56 (19 self)
Inductive logic programming, or relational learning, is a powerful paradigm for machine learning or data mining. However, in order for ILP to become practically useful, the efficiency of ILP systems must improve substantially. To this end, the notion of a query pack is introduced: it structures sets of similar queries. Furthermore, a mechanism is described for executing such query packs. A complexity analysis shows that considerable efficiency improvements can be achieved through the use of this query pack execution mechanism. This claim is supported by empirical results obtained by incorporating support for query pack execution in two existing learning systems.
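The core idea of a query pack, sharing the common prefix of similar queries so it is evaluated only once, can be sketched as a prefix tree over literal sequences. This is a structural sketch only; the paper's execution mechanism inside a Prolog engine is considerably more involved:

```python
def build_pack(queries):
    """Arrange queries (lists of literals) into a tree that shares common
    prefixes, so a shared prefix is represented only once (and, in a real
    execution engine, evaluated only once)."""
    pack = {}
    for q in queries:
        node = pack
        for lit in q:
            node = node.setdefault(lit, {})
    return pack

queries = [
    ["atom(X)", "bond(X, Y)", "charge(Y, pos)"],
    ["atom(X)", "bond(X, Y)", "charge(Y, neg)"],
    ["atom(X)", "element(X, c)"],
]
pack = build_pack(queries)
```

Here "atom(X)" appears once at the root and "bond(X, Y)" once below it, instead of being re-evaluated for every query that contains them; this sharing is the source of the efficiency gain the abstract describes.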
Randomized clustering forests for image classification
 Pattern Analysis and Machine Intelligence
Abstract

Cited by 43 (4 self)
Abstract—This paper introduces three new contributions to the problems of image classification and image search. First, we propose a new image patch quantization algorithm. Other competitive approaches require a large code book and the sampling of many local regions for accurate image description, at the expense of a prohibitive processing time. We introduce Extremely Randomized Clustering Forests—ensembles of randomly created clustering trees—that are more accurate, much faster to train and test, and more robust to background clutter compared to state-of-the-art methods. Second, we propose an efficient image classification method that combines ERC-Forests and saliency maps very closely with image information sampling. For a given image, a classifier builds a saliency map online, which it uses for classification. We demonstrate speed and accuracy improvements in several state-of-the-art image classification tasks. Finally, we show that our ERC-Forests can be used very successfully for learning distances between images of never-seen objects. Our algorithm learns the characteristic differences between local descriptors sampled from pairs of the “same” or “different” objects, quantizes these differences with ERC-Forests, and computes the similarity from this quantization. We show significant improvement over state-of-the-art competitive approaches. Index Terms—Randomized trees, image classification, object recognition, similarity measure.
Scaling up inductive logic programming by learning from interpretations
 Data Mining and Knowledge Discovery
, 1999
Abstract

Cited by 42 (14 self)
Abstract. When comparing inductive logic programming (ILP) and attribute-value learning techniques, there is a trade-off between expressive power and efficiency. Inductive logic programming techniques are typically more expressive but also less efficient. Therefore, the data sets handled by current inductive logic programming systems are small according to general standards within the data mining community. The main source of inefficiency lies in the assumption that several examples may be related to each other, so they cannot be handled independently. Within the learning from interpretations framework for inductive logic programming this assumption is unnecessary, which makes it possible to scale up existing ILP algorithms. In this paper we explain this learning setting in the context of relational databases. We relate the setting to propositional data mining and to the classical ILP setting, and show that learning from interpretations corresponds to learning from multiple relations and thus extends the expressiveness of propositional learning, while maintaining its efficiency to a large extent (which is not the case in the classical ILP setting). As a case study, we present two alternative implementations of the ILP system Tilde (Top-down Induction of Logical DEcision trees): Tilde-classic, which loads all data in main memory, and Tilde-LDS, which loads the examples one by one. We experimentally compare the implementations, showing that Tilde-LDS can handle large data sets (on the order of 100,000 examples or 100 MB) and indeed scales up linearly in the number of examples.
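The key point, that each example (interpretation) is self-contained and can therefore be processed independently, can be sketched as a streaming coverage count in the spirit of loading examples one by one. This is illustrative code, not the system's implementation:

```python
def coverage(pattern, example_stream):
    """Count how many examples a pattern covers, streaming the examples
    one by one: each interpretation is self-contained, so no other
    example needs to be in memory at the same time."""
    n = 0
    for interpretation in example_stream:
        if pattern(interpretation):
            n += 1
    return n

# Toy interpretations: each example is a set of ground facts (tuples).
examples = [
    {("bond", "a1", "a2"), ("element", "a1", "c")},
    {("element", "a1", "o")},
]
has_carbon = lambda ex: any(f[0] == "element" and f[2] == "c" for f in ex)
```

In the classical ILP setting, where examples may share facts, such a per-example loop is not possible, which is exactly the scalability contrast the abstract draws.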
A Polynomial Time Computable Metric Between Point Sets
, 2000
Abstract

Cited by 41 (3 self)
Measuring the similarity or distance between two sets of points in a metric space is an important problem in machine learning, and also has applications in other disciplines, e.g., computational geometry, the philosophy of science, and methods for updating or changing theories. Recently, Eiter and Mannila proposed a new measure which is computable in polynomial time. However, it is not a distance function in the mathematical sense because it does not satisfy the triangle inequality.
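The abstract does not spell out the proposed measure; as a standard reference point for this setting, the classical Hausdorff distance between finite point sets is also polynomial-time computable and does satisfy the triangle inequality (assuming a ground metric d):

```python
def hausdorff(A, B, d):
    """Classical Hausdorff distance between finite point sets A and B
    under a ground metric d; it satisfies the triangle inequality."""
    directed = lambda X, Y: max(min(d(x, y) for y in Y) for x in X)
    return max(directed(A, B), directed(B, A))

euclid = lambda p, q: sum((a - b) ** 2 for a, b in zip(p, q)) ** 0.5

A = [(0.0, 0.0), (1.0, 0.0)]
B = [(0.0, 0.0), (3.0, 0.0)]
```

The Hausdorff distance is, however, very sensitive to a single outlier point, which is one motivation for the alternative set measures discussed in this line of work.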
Speeding up Relational Reinforcement Learning Through the Use of an Incremental First Order Decision Tree Learner
 Proceedings of the 13th European Conference on Machine Learning
, 2001
Abstract

Cited by 36 (20 self)
Relational reinforcement learning (RRL) is a learning technique that combines standard reinforcement learning with inductive logic programming to enable the learning system to exploit structural knowledge about the application domain.
A Framework for Defining Distances Between First-Order Logic Objects
, 1998
Abstract

Cited by 30 (3 self)
In this paper we develop a framework for distances between clauses and distances between models. The framework can be parametrised by a measure for the distance between atoms. It takes into account subterms common to distinct atoms of a set of atoms in the measurement of the distance between sets. Moreover, for a constant number of variables, the complexity of the distance computation is polynomially bounded by the size of the objects. Initial experiments show that the framework can be the basis of good clustering algorithms. The framework consists of three levels: at the first level one chooses a distance between atoms; the second level upgrades this distance to a distance between sets of atoms. We propose a framework that is a generalisation of three polynomial-time computable similarity measures proposed by Eiter and Mannila, and an instance which is a real distance function, computable in polynomial time. We also develop a binary prototype function for sets of points.
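The second level, upgrading an atom distance to a set distance, can be illustrated with a brute-force minimum-cost matching in which unmatched atoms pay a fixed penalty. This is a hedged sketch of the general idea only, not the paper's construction, and the penalty parameter is an assumption of this sketch:

```python
from itertools import permutations

def set_distance(S1, S2, d, penalty=1.0):
    """Upgrade an atom distance d to a distance between sets of atoms via
    a best one-to-one matching (brute force over all matchings); atoms
    left unmatched pay a fixed penalty. Illustrative sketch only."""
    small, big = sorted([list(S1), list(S2)], key=len)
    best = float("inf")
    for matched in permutations(big, len(small)):
        cost = sum(d(a, b) for a, b in zip(small, matched))
        cost += penalty * (len(big) - len(small))  # charge for unmatched atoms
        best = min(best, cost)
    return best

# Toy atom distance: 0 for identical atoms, 1 otherwise.
atom_d = lambda a, b: 0.0 if a == b else 1.0
```

Brute force is exponential in the set size; practical instances of this idea use polynomial-time matching algorithms, in line with the complexity bound stated in the abstract.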