Results 1 - 10 of 76
A Perspective on Inductive Logic Programming
Cited by 40 (7 self)
The state-of-the-art in inductive logic programming is surveyed by analyzing the approach taken by this field over the past 8 years. The analysis investigates the roles of 1) logic programming and machine learning, of 2) theory, techniques and applications, of 3) various technical problems addressed within inductive logic programming.
1 Introduction
The term inductive logic programming was first coined by Stephen Muggleton in 1990 [1]. Inductive logic programming is concerned with the study of inductive machine learning within the representations offered by computational logic. Since 1991, annual international workshops have been organized [2-8]. This paper is an attempt to analyze the developments within this field. Particular attention is devoted to the relation between inductive logic programming and its neighboring fields such as machine learning, computational logic and data mining, and to the role that theory, techniques and implementations, and applications play. The analysis...
Subgroup Discovery with CN2-SD
Journal of Machine Learning Research, 2004
Cited by 34 (7 self)
... discovery. The goal of subgroup discovery is to find rules describing subsets of the population that are sufficiently large and statistically unusual. The paper presents a subgroup discovery algorithm, CN2-SD, developed by modifying parts of the CN2 classification rule learner: its covering algorithm, search heuristic, probabilistic classification of instances, and evaluation measures. Experimental evaluation of CN2-SD on 23 UCI data sets shows substantial reduction of the number of induced rules, increased rule coverage and rule significance, as well as slight improvements in terms of the area under ROC curve, when compared with the CN2 algorithm. Application of CN2-SD to a large traffic accident data set confirms these findings.
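The two criteria named in this abstract, "sufficiently large" and "statistically unusual", can be combined into a single rule score. The sketch below uses weighted relative accuracy, a measure commonly associated with subgroup discovery; the abstract itself does not spell out the heuristic, so treat the choice of measure, the function name `wracc`, and the count-based parameters as assumptions made for illustration.

```python
def wracc(n_total, n_class, n_cond, n_cond_class):
    """Weighted relative accuracy of a rule "Cond -> Class" (illustrative sketch).

    n_total      -- number of examples in the population
    n_class      -- examples belonging to the target class
    n_cond       -- examples covered by the rule condition
    n_cond_class -- covered examples that also belong to the target class
    """
    if n_total == 0 or n_cond == 0:
        return 0.0
    coverage = n_cond / n_total                           # how large the subgroup is
    unusualness = n_cond_class / n_cond - n_class / n_total  # shift in class share
    return coverage * unusualness


# A subgroup covering 20 of 100 examples, 16 of them in a class that
# makes up 40% of the population.
score = wracc(n_total=100, n_class=40, n_cond=20, n_cond_class=16)
```

A rule that covers the whole population scores zero, because its class distribution cannot differ from the overall one; a tiny but pure subgroup is penalised by the coverage factor.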
Comparative Evaluation of Approaches to Propositionalization
2003
Cited by 33 (2 self)
Propositionalization has already been shown to be a promising approach for robustly and effectively handling relational data sets for knowledge discovery. In this paper, we compare up-to-date methods for propositionalization from two main groups: logic-oriented and database-oriented techniques. Experiments using several learning tasks (both ILP benchmarks and tasks from recent international data mining competitions) show that both groups have their specific advantages. While logic-oriented methods can handle complex background knowledge and provide expressive first-order models, database-oriented methods can be more efficient, especially on larger data sets. Obtained accuracies vary such that a combination of the features produced by both groups seems a further valuable venture.
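As a rough illustration of what propositionalization means in practice, the sketch below flattens a one-to-many relation into one feature row per target record by aggregating the related rows. Aggregation is one plausible reading of the "database-oriented" group; the abstract does not describe the compared methods in detail, so the function `propositionalize`, its parameters, and the customer/order data are all hypothetical.

```python
from collections import defaultdict


def propositionalize(targets, related, key, column, prefix):
    """Build one flat row per target record by aggregating related rows.

    targets -- list of dicts, one per target record
    related -- list of dicts from a second table, linked to targets by `key`
    column  -- numeric column of `related` to aggregate
    prefix  -- name prefix for the generated feature columns
    """
    grouped = defaultdict(list)
    for row in related:
        grouped[row[key]].append(row[column])

    table = []
    for target in targets:
        values = grouped.get(target[key], [])
        flat = dict(target)  # copy the target's own attributes
        flat[prefix + "_count"] = len(values)
        flat[prefix + "_sum"] = sum(values)
        flat[prefix + "_max"] = max(values) if values else None
        table.append(flat)
    return table


customers = [{"id": 1, "label": "pos"}, {"id": 2, "label": "neg"}]
orders = [{"id": 1, "amount": 10.0}, {"id": 1, "amount": 5.0}]
flat_table = propositionalize(customers, orders, key="id",
                              column="amount", prefix="order")
```

The resulting single table can be handed to any ordinary attribute-value learner, which is the point of the transformation.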
Discovery of Relational Association Rules
Relational data mining, 2000
Cited by 25 (0 self)
Within KDD, the discovery of frequent patterns has been studied in a variety of settings. In its simplest form, known from association rule mining, the task is to discover all frequent item sets, i.e., all combinations of items that are found in a sufficient number of examples.
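The frequent item set task defined in this abstract can be sketched in a few lines. The code below is a minimal level-wise search in the style usually used for association rule mining, not the relational method of the paper; the function name `frequent_item_sets`, the `min_support` threshold (an absolute count), and the shopping-basket data are assumptions for illustration.

```python
from itertools import combinations


def frequent_item_sets(examples, min_support):
    """Return {item set: count} for all item sets found in >= min_support examples."""
    examples = [frozenset(e) for e in examples]
    items = sorted({item for e in examples for item in e})
    candidates = [frozenset([item]) for item in items]
    frequent = {}
    while candidates:
        # Count how many examples contain each candidate item set.
        counts = {c: sum(1 for e in examples if c <= e) for c in candidates}
        level = {c: n for c, n in counts.items() if n >= min_support}
        frequent.update(level)
        # Join frequent k-sets into (k+1)-sets and keep only those whose
        # k-subsets are all frequent (a superset of an infrequent set
        # cannot itself be frequent).
        size = len(next(iter(level))) if level else 0
        next_candidates = set()
        for a, b in combinations(list(level), 2):
            union = a | b
            if len(union) == size + 1 and all(
                frozenset(s) in level for s in combinations(union, size)
            ):
                next_candidates.add(union)
        candidates = list(next_candidates)
    return frequent


baskets = [
    {"bread", "milk"},
    {"bread", "butter"},
    {"bread", "milk", "butter"},
    {"milk"},
]
result = frequent_item_sets(baskets, min_support=2)
```

With these four examples and a threshold of two, the pair milk and butter is dropped because it occurs only once, which in turn rules out the three-item set without counting it.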
Expert-Guided Subgroup Discovery: Methodology and Application
Journal of Artificial Intelligence Research, 2002
Cited by 23 (5 self)
This paper presents an approach to expert-guided subgroup discovery. The main step of the subgroup discovery process, the induction of subgroup descriptions, is performed by a heuristic beam search algorithm, using a novel parametrized definition of rule quality which is analyzed in detail. The other important steps of the proposed subgroup discovery process are the detection of statistically significant properties of selected subgroups and subgroup visualization: statistically significant properties are used to enrich the descriptions of induced subgroups, while the visualization shows subgroup properties in the form of distributions of the numbers of examples in the subgroups. The approach is illustrated by the results obtained for a medical problem of early detection of patient risk groups.
Confirmation-guided discovery of first-order rules with Tertius
Machine Learning, 2000
Cited by 23 (9 self)
This paper deals with learning first-order logic rules from data lacking an explicit classification predicate. Consequently, the learned rules are not restricted to predicate definitions as in supervised inductive logic programming. First-order logic offers the ability to deal with structured, multi-relational knowledge. Possible applications include first-order knowledge discovery, induction of integrity constraints in databases, multiple predicate learning, and learning mixed theories of predicate definitions and integrity constraints. One of the contributions of our work is a heuristic measure of confirmation, trading off novelty and satisfaction of the rule. The approach has been implemented in the Tertius system. The system performs an optimal best-first search, finding the k most confirmed hypotheses, and includes a non-redundant refinement operator to avoid duplicates in the search. Tertius can be adapted to many different domains by tuning its parameters, and it can deal eithe...
Exploiting Background Knowledge for Knowledge-Intensive Subgroup Discovery
In: Proc. 19th Intl. Joint Conference on Artificial Intelligence (IJCAI-05), 2005
Cited by 22 (15 self)
In general, knowledge-intensive data mining methods exploit background knowledge to improve the quality of their results. In knowledge-rich domains, the interestingness of the mined patterns can often be increased significantly. In this paper we categorize several classes of background knowledge for subgroup discovery, and present how the necessary knowledge elements can be modelled. Furthermore, we show how subgroup discovery methods benefit from the utilization of background knowledge, and discuss its application in an incremental process-model. The context of our work is to identify interesting diagnostic patterns to supplement a medical documentation and consultation system. We provide a case study in the medical domain, using a case base from a real-world application.
RSD: Relational subgroup discovery through first-order feature construction
In 12th International Conference on Inductive Logic Programming, 2002
Cited by 20 (6 self)
Relational rule learning is typically used in solving classification and prediction tasks. However, relational rule learning can also be adapted to subgroup discovery. This paper proposes a propositionalization approach to relational subgroup discovery, achieved through appropriately adapting rule learning and first-order feature construction.
Finding the Most Interesting Patterns in a Database Quickly by Using Sequential Sampling
Journal of Machine Learning Research, 2001
Cited by 19 (2 self)
Many discovery problems, e.g., subgroup or association rule discovery, can naturally be cast as n-best hypotheses problems where the goal is to find the n hypotheses from a given hypothesis space that score best according to a certain utility function. We present a sampling algorithm that solves this problem by issuing a small number of database queries while guaranteeing precise bounds on confidence and quality of solutions. Known sampling approaches have treated single hypothesis selection problems, assuming that the utility is the average (over the examples) of some function, which is not the case for many frequently used utility functions. We show that our algorithm works for all utilities that can be estimated with bounded error. We provide these error bounds and resulting worst-case sample bounds for some of the most frequently used utilities, and prove that there is no sampling algorithm for a popular class of utility functions that cannot be estimated with bounded error. The algorithm is sequential in the sense that it starts to return (or discard) hypotheses that already seem to be particularly good (or bad) after a few examples. Thus, the algorithm is almost always faster than its worst-case bounds.
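The idea of stopping early once the sample already separates good from bad hypotheses can be sketched for the simplest case the abstract mentions, where the utility is an average over examples with values in [0, 1]. The code below is a simplified illustration using Hoeffding confidence intervals and a crude union bound; it is not the paper's algorithm (which returns or discards hypotheses individually and covers more general utilities), and the names `hoeffding_radius` and `sequential_n_best` are hypothetical.

```python
import math


def hoeffding_radius(m, delta):
    """Half-width of a two-sided Hoeffding interval for a mean of m values in [0, 1]."""
    return math.sqrt(math.log(2.0 / delta) / (2.0 * m))


def sequential_n_best(hypotheses, examples, n, delta):
    """Pick the n hypotheses with the highest average utility, stopping early.

    hypotheses -- dict mapping a name to a function example -> value in [0, 1]
    examples   -- non-empty list of examples, read one at a time
    Returns (names of the n best, number of examples actually read).
    """
    names = list(hypotheses)
    if not examples or not 0 < n < len(names):
        raise ValueError("need examples and 0 < n < number of hypotheses")
    # Crude union bound over every hypothesis and every possible stopping point.
    per_test_delta = delta / (len(names) * len(examples))
    totals = {name: 0.0 for name in names}
    seen = 0
    for seen, example in enumerate(examples, start=1):
        for name in names:
            totals[name] += hypotheses[name](example)
        ranked = sorted(names, key=lambda name: totals[name], reverse=True)
        radius = hoeffding_radius(seen, per_test_delta)
        worst_kept = totals[ranked[n - 1]] / seen
        best_dropped = totals[ranked[n]] / seen
        # Stop as soon as the confidence intervals of the n-th and (n+1)-th
        # ranked hypotheses no longer overlap.
        if worst_kept - radius >= best_dropped + radius:
            return ranked[:n], seen
    return ranked[:n], seen


toy_hypotheses = {"good": lambda example: 1.0, "bad": lambda example: 0.0}
best, used = sequential_n_best(toy_hypotheses, list(range(1000)), n=1, delta=0.05)
```

On this toy input the two hypotheses are far apart, so the loop stops after a small fraction of the 1000 examples; hypotheses with close utilities would force it to read much more data.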
Multi-Relational Decision Tree Induction
In Proceedings of PKDD ’99, Prague, Czech Republic, September 1999
Cited by 18 (0 self)
Discovering decision trees is an important set of techniques in KDD, both because of their simple interpretation and the efficiency of their discovery. One of their disadvantages is that they do not take the structure of the mining object into account. By going from the standard single-relation approach to the multi-relational approach as in ILP this disadvantage is removed. However, the straightforward generalization loses the efficiency of the standard algorithms. In this paper we present a framework that allows the efficient discovery of multi-relational decision trees through the exploitation of the domain knowledge encoded in the data model of the database.
Introduction
The induction of decision trees has been getting a lot of attention in the field of Knowledge Discovery in Databases over the past few years. This popularity has been largely due to the efficiency with which decision trees can be induced from large datasets, as well as to the elegant and intuitive representation ...

