A simple, fast, and effective rule learner
 In Proceedings of the Annual Conference of the American Association for Artificial Intelligence
, 1999
Cited by 93 (3 self)
We describe SLIPPER, a new rule learner that generates rulesets by repeatedly boosting a simple, greedy rule-builder. Like the rulesets built by other rule learners, the ensemble of rules created by SLIPPER is compact and comprehensible. This is made possible by imposing appropriate constraints on the rule-builder, and by use of a recently proposed generalization of AdaBoost called confidence-rated boosting. In spite of its relative simplicity, SLIPPER is highly scalable and an effective learner. Experimentally, SLIPPER scales no worse than O(n log n), where n is the number of examples, and on a set of 32 benchmark problems, SLIPPER achieves lower error rates than RIPPER 20 times, and lower error rates than C4.5rules 22 times.
A Survey of Methods for Scaling Up Inductive Algorithms
 Data Mining and Knowledge Discovery
, 1999
Cited by 85 (10 self)
One of the defining challenges for the KDD research community is to enable inductive learning algorithms to mine very large databases. This paper summarizes, categorizes, and compares existing work on scaling up inductive algorithms. We concentrate on algorithms that build decision trees and rule sets, in order to provide focus and specific details; the issues and techniques generalize to other types of data mining. We begin with a discussion of important issues related to scaling up. We highlight similarities among scaling techniques by categorizing them into three main approaches. For each approach, we then describe, compare, and contrast the different constituent techniques, drawing on specific examples from published papers. Finally, we use the preceding analysis to suggest how to proceed when dealing with a large problem, and where to focus future research. Keywords: scaling up, inductive learning, decision trees, rule learning 1. Introduction The knowledge discovery and data...
Scaling up inductive logic programming by learning from interpretations
 Data Mining and Knowledge Discovery
, 1999
Cited by 41 (14 self)
When comparing inductive logic programming (ILP) and attribute-value learning techniques, there is a trade-off between expressive power and efficiency. Inductive logic programming techniques are typically more expressive but also less efficient. Therefore, the data sets handled by current inductive logic programming systems are small according to general standards within the data mining community. The main source of inefficiency lies in the assumption that several examples may be related to each other, so they cannot be handled independently. Within the learning from interpretations framework for inductive logic programming this assumption is unnecessary, which allows existing ILP algorithms to be scaled up. In this paper we explain this learning setting in the context of relational databases. We relate the setting to propositional data mining and to the classical ILP setting, and show that learning from interpretations corresponds to learning from multiple relations and thus extends the expressiveness of propositional learning, while maintaining its efficiency to a large extent (which is not the case in the classical ILP setting). As a case study, we present two alternative implementations of the ILP system Tilde (Top-down Induction of Logical DEcision trees): Tildeclassic, which loads all data in main memory, and TildeLDS, which loads the examples one by one. We experimentally compare the implementations, showing that TildeLDS can handle large data sets (on the order of 100,000 examples or 100 MB) and indeed scales up linearly in the number of examples.
A study of two sampling methods for analysing large datasets with ILP
, 1999
Cited by 24 (5 self)
This paper is concerned with problems that arise when submitting large quantities of data to analysis by an Inductive Logic Programming (ILP) system. Complexity arguments usually make it prohibitive to analyse such datasets in their entirety. We examine two schemes that allow an ILP system to construct theories by sampling from this large pool of data. The first, "subsampling", is a single-sample design in which the utility of a potential rule is evaluated on a randomly selected subsample of the data. The second, "logical windowing", is a multiple-sample design that tests and sequentially includes errors made by a partially correct theory. Both schemes are derived from techniques developed to enable propositional learning methods (like decision trees) to cope with large datasets. The ILP system CProgol, equipped with each of these methods, is used to construct theories for two datasets: one artificial (a chess endgame) and the other naturally occurring (a language tagging problem). I...
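The "subsampling" idea described in this abstract, scoring a candidate rule on a random subsample instead of the full data, can be sketched roughly as follows. This is only an illustration under stated assumptions: the function names, the use of accuracy over covered examples as the utility measure, and the `sample_size` and `seed` parameters are all hypothetical and are not taken from the paper or from CProgol.

```python
import random

def rule_accuracy(rule, examples):
    """Fraction of the examples covered by `rule` that are positive.

    `rule` is any predicate over an example's features; `examples` is a list
    of (features, is_positive) pairs. Returns 0.0 if nothing is covered.
    """
    covered = [is_pos for features, is_pos in examples if rule(features)]
    if not covered:
        return 0.0
    return sum(covered) / len(covered)

def subsampled_accuracy(rule, examples, sample_size, seed=0):
    """Estimate the rule's accuracy on one random subsample (assumed design:
    a single sample drawn without replacement)."""
    rng = random.Random(seed)
    k = min(sample_size, len(examples))
    sample = rng.sample(examples, k)
    return rule_accuracy(rule, sample)

# Hypothetical toy data: the feature is an integer, positives are multiples of 3.
data = [(x, x % 3 == 0) for x in range(3000)]
candidate = lambda x: x % 6 == 0  # a candidate rule covering only positives

full = rule_accuracy(candidate, data)
estimate = subsampled_accuracy(candidate, data, sample_size=300)
```

The estimate costs time proportional to the subsample rather than to the whole dataset, at the price of sampling variance in the score.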
Distributed Data Mining: Scaling up and beyond
 In Advances in Distributed and Parallel Knowledge Discovery
, 1999
Cited by 14 (0 self)
In this chapter I begin by discussing Distributed Data Mining (DDM) for scaling up, beginning by asking what scaling up means, questioning whether it is necessary, and then presenting a brief survey of what has been done to date. I then provide motivation beyond scaling up, arguing that DDM is a more natural way to view data mining generally. DDM eliminates many difficulties encountered when coalescing already-distributed data for monolithic data mining, such as those associated with heterogeneity of data and with privacy restrictions. By viewing data mining as inherently distributed, important open research issues come into focus, issues that currently are obscured by the lack of explicit treatment of the process of producing monolithic data sets. I close with a discussion of the necessity of DDM for an efficient process of knowledge discovery.
Speeding-up Pittsburgh learning classifier systems: Modeling time and accuracy
 In Parallel Problem Solving from Nature - PPSN 2004
, 2004
Cited by 11 (9 self)
Windowing methods are useful techniques to reduce the computational cost of Pittsburgh-style genetic-based machine learning techniques. If used properly, they additionally can be used to improve the classification accuracy of the system. In this paper we develop a theoretical framework for a windowing scheme called ILAS, developed previously by the authors. The framework allows us to approximate the degree of windowing we can apply to a given dataset as well as the gain in runtime. The framework sets the first stage for the development of a larger methodology with several types of learning strategies in which we can apply ILAS, such as maximizing the learning performance of the system, or achieving the maximum runtime reduction without significant accuracy loss.
On the Use of Fast Subsampling Estimates for Algorithm Recommendation
 Österreichisches Forschungsinstitut für Artificial Intelligence
, 2002
Cited by 7 (0 self)
The use of subsampling for scaling up the performance of learning algorithms has become fairly popular in the recent literature. In this paper, we investigate the use of performance estimates obtained on a subsample of the data for the task of recommending the best learning algorithm(s) for the problem. In particular, we examine the use of subsampling estimates as features for meta-learning, thereby generalizing previous work on landmarking and on direct algorithm recommendation via subsampling.
Dimensionality Reduction in ILP: A Call To Arms
Cited by 6 (1 self)
The recent rise of Knowledge Discovery in Databases (KDD) has underlined the need for machine learning algorithms to be able to tackle large-scale applications that are currently beyond their scope. One way to address this problem is to use techniques for reducing the dimensionality of the learning problem by reducing the hypothesis space and/or reducing the example space. While research in machine learning has devoted considerable attention to such techniques, they have so far been neglected in ILP research. The purpose of this paper is to motivate research in this area and to present some results on windowing techniques. 1 Introduction One of the most often heard prejudices against ILP algorithms is that they are only applicable to toy problems and will not scale up to applications of significant size. While it is our firm belief that the order of magnitude of this unspecified "significant size" is monotonically increasing in order to keep the argument alive, it is nevertheless indis...
More Efficient Windowing
, 1997
Cited by 5 (2 self)
Windowing has been proposed as a procedure for efficient memory use in the ID3 decision tree learning algorithm. However, previous work has shown that windowing may often lead to a decrease in performance. In this work, we argue that rule learning algorithms are more appropriate for windowing than decision tree algorithms, because the former typically learn and evaluate rules independently and are thus less susceptible to changes in class distributions. Most importantly, we present a new windowing algorithm that achieves additional gains in efficiency by saving promising rules and removing examples covered by these rules from the learning window. While the presented algorithm is only suitable for redundant, noise-free data sets, we will also briefly discuss the problem of noisy data for windowing algorithms. Introduction Windowing is a general technique that aims at improving the efficiency of inductive classification learners by identifying an appropriate subset of the given t...
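For orientation, the basic windowing loop that this abstract builds on can be sketched as below. This is a minimal illustration of plain windowing only (learn on a small window, test on the rest, add misclassified examples, repeat); it is not the rule-saving algorithm the paper presents. The function names, the toy threshold learner, and the `init_size`, `max_add`, and `seed` parameters are assumptions made for the example.

```python
import random

def learn_stump(examples):
    """Toy stand-in for a real rule or tree learner: choose the threshold t on a
    single numeric feature so that 'x >= t' misclassifies the fewest examples.
    An infinite threshold (predict all negative) is also considered."""
    candidates = sorted({x for x, _ in examples}) + [float("inf")]
    best_t, best_err = None, None
    for t in candidates:
        err = sum((x >= t) != y for x, y in examples)
        if best_err is None or err < best_err:
            best_t, best_err = t, err
    return lambda x: x >= best_t

def windowing(examples, learn, init_size=10, max_add=10, seed=0):
    """Learn from a growing window until the model makes no errors on the
    examples outside the window. Returns (model, final window size)."""
    rng = random.Random(seed)
    order = list(range(len(examples)))
    rng.shuffle(order)
    window, rest = order[:init_size], order[init_size:]
    while True:
        model = learn([examples[i] for i in window])
        wrong = [i for i in rest if model(examples[i][0]) != examples[i][1]]
        if not wrong:
            return model, len(window)
        added = wrong[:max_add]          # move some misclassified examples in
        added_set = set(added)
        window.extend(added)
        rest = [i for i in rest if i not in added_set]

# Hypothetical noise-free data: the label is True exactly when x >= 100.
data = [(x, x >= 100) for x in range(200)]
model, window_size = windowing(data, learn_stump)
```

The loop always terminates because every pass either stops or moves at least one example from the remainder into the window. On noisy data the window can keep growing toward the full dataset, which is the difficulty the abstract alludes to.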
Learning Minesweeper with multirelational learning
 In Proc. of the 18th IJCAI
, 2003
Cited by 5 (1 self)
Minesweeper is a one-person game which looks deceptively easy to play, but where average human performance is far from optimal. Playing the game requires logical, arithmetic and probabilistic reasoning based on spatial relationships on the board. Simply checking a board state for consistency is an NP-complete problem. Given the difficulty of handcrafting strategies to play this and other games, AI researchers have always been interested in automatically learning such strategies from experience. In this paper, we show that when integrating certain techniques into a general-purpose learning system (Mio), the resulting system is capable of inducing a Minesweeper playing strategy that beats the winning rate of average human players. In addition, we discuss the necessary background knowledge, present experimental results demonstrating the gain obtained with our techniques and show the strategy learned for the game.