Instance-Based Learning: Nearest Neighbour with Generalisation
, 1995
Cited by 11 (0 self)
Instance-based learning is a machine learning method that classifies new examples by comparing them to those already seen and held in memory. There are two types of instance-based learning: nearest neighbour and case-based reasoning. Of these two methods, nearest neighbour fell into disfavour during the 1980s, but regained popularity recently due to its simplicity and ease of implementation. Nearest neighbour learning is not without problems. It is difficult to define a distance function that works well for both discrete and continuous attributes. Noise and irrelevant attributes also pose problems. Finally, the specificity bias adopted by instance-based learning, while often an advantage, can over-represent small rules at the expense of more general concepts, leading to a marked decrease in classification performance for some domains. Generalised exemplars offer a solution. Examples that share the same class are grouped together, and so represent large rules more fully. This reduces the rol...
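The abstract notes that it is hard to define a distance function that suits both discrete and continuous attributes. The sketch below shows one common workaround for plain nearest neighbour, assuming an overlap measure for discrete attributes and a range-scaled difference for continuous ones. It is an illustration only, not the generalised-exemplar method of the paper; the names `mixed_distance`, `classify_1nn` and `ranges` are hypothetical.

```python
def mixed_distance(a, b, ranges):
    """Euclidean-style distance over mixed attributes.

    ranges[i] is None for a discrete attribute, or the (assumed known)
    value range of a continuous attribute.
    """
    total = 0.0
    for x, y, r in zip(a, b, ranges):
        if r is None:
            # Discrete attribute: 0 if the values match, 1 otherwise.
            d = 0.0 if x == y else 1.0
        else:
            # Continuous attribute: absolute difference scaled by its range.
            d = abs(x - y) / r if r > 0 else 0.0
        total += d * d
    return total ** 0.5


def classify_1nn(query, examples, ranges):
    """Return the class of the stored example closest to the query.

    examples is a list of (attribute_tuple, class_label) pairs.
    """
    nearest = min(examples, key=lambda ex: mixed_distance(query, ex[0], ranges))
    return nearest[1]


# Two stored examples with one continuous and one discrete attribute.
stored = [((1.0, "red"), "A"), ((9.0, "blue"), "B")]
attribute_ranges = (10.0, None)
print(classify_1nn((2.0, "red"), stored, attribute_ranges))
```

Scaling each continuous attribute by its range keeps it on the same 0 to 1 footing as the discrete overlap measure, so that no single attribute dominates the distance.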
Data Fitting with Rule-Based Regression
- In Proceedings of the 2nd International Workshop on Artificial Intelligence Techniques (AIT'95)
, 1995
Cited by 10 (3 self)
In classical regression theory we try to build one functional model to fit a set of data. In noisy and complex domains this methodology can be highly unreliable and/or demand overly complex functional models. Piecewise regression models provide a means to overcome these difficulties. Some existing approaches to piecewise regression are based on regression trees. However, rules are known to be more powerful descriptive languages than trees. This paper describes the rule learning system R². This system learns a set of regression rules from a classical machine learning data set. Regression rules are IF-THEN rules that have regression models in the conclusion. The conditional part of these rules determines the domain of applicability of the respective model. We believe that by adopting a rule-based formalism, R² will out-perform regression trees. The initial set of experiments that we have conducted on artificial data sets shows that R² compares reasonably to other ma...
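To make the notion of a regression rule concrete, here is a minimal sketch of IF-THEN rules whose conclusion is a linear model. It only illustrates how such rules could be represented and applied; it does not reproduce how the system in the paper learns them. The class `RegressionRule`, the function `predict` and the first-match policy are assumptions made for the example.

```python
from dataclasses import dataclass
from typing import Callable, Sequence


@dataclass
class RegressionRule:
    """IF condition(x) THEN y = intercept + sum(coefs[i] * x[i])."""
    condition: Callable[[Sequence[float]], bool]
    intercept: float
    coefs: Sequence[float]

    def evaluate(self, x: Sequence[float]) -> float:
        # Apply the linear model stored in the rule's conclusion.
        return self.intercept + sum(c * v for c, v in zip(self.coefs, x))


def predict(rules, x, default=0.0):
    """Use the first rule whose condition covers x (an assumed policy)."""
    for rule in rules:
        if rule.condition(x):
            return rule.evaluate(x)
    return default  # no rule applies to this case


# A piecewise model: one linear model per region of the input space.
rules = [
    RegressionRule(lambda x: x[0] < 0, intercept=1.0, coefs=[2.0]),
    RegressionRule(lambda x: x[0] >= 0, intercept=0.0, coefs=[0.5]),
]
print(predict(rules, [4.0]))
```

Each condition delimits the region where its model applies, which is the piecewise behaviour the abstract describes.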
Exemplar-Based Reasoning in Geological Prospect Appraisal
- TURING INSTITUTE
, 1989
Cited by 7 (0 self)
This paper describes a prototype system for the prediction of parameters associated with a geological prospect, required for assessing the likelihood of hydrocarbons at the prospect. A principal characteristic of expert reasoning in this domain involves the recall and use of previous examples (`exemplars') of similar, already-drilled wells in addition to the use of general rules concerning known parameters of the prospect. Such
Learning from Imperfect Data
- IN MACHINE LEARNING, META-REASONING AND LOGICS, P. BRAZDIL AND K. KONOLIGE (EDS.)
, 1990
Cited by 6 (2 self)
Systems interacting with real-world data must address the issues raised by the possible presence of errors in the observations they make. In this paper we first present a framework for discussing imperfect data and the resulting problems it may cause. We
Rule induction for subgroup discovery with CN2-SD
- 2nd Int. Workshop on Integration and Collaboration Aspects of Data Mining, Decision Support and MetaLearning
, 2002
Cited by 5 (1 self)
Rule learning is typically used in solving classification and prediction tasks. However, learning of classification rules can also be adapted to subgroup discovery. This paper shows how this can be achieved by modifying the CN2 rule learning algorithm. Modifications include a new covering algorithm (weighted covering algorithm), a new search heuristic (weighted relative accuracy), probabilistic classification of instances, and a new measure for evaluating the results of subgroup discovery (area under ROC curve). The main advantage of the proposed approach is that each rule with high weighted accuracy represents a ‘chunk’ of knowledge about the problem, due to the appropriate tradeoff between accuracy and coverage, achieved through the use of the weighted relative accuracy heuristic. Moreover, unlike the classical covering algorithm, in which only the first few induced rules may be of interest as subgroup descriptors with sufficient coverage (since subsequently induced rules are induced from biased example subsets), the subsequent rules induced by the weighted covering algorithm allow for discovering interesting subgroup properties of the entire population. Experimental results on 17 UCI datasets are very promising, demonstrating big improvements in number of induced rules, rule coverage and rule significance, as well as smaller improvements in rule accuracy and area under ROC curve.
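The tradeoff between accuracy and coverage mentioned above can be illustrated with the usual definition of weighted relative accuracy, WRAcc(Cond -> Class) = p(Cond) * (p(Class | Cond) - p(Class)). The sketch below is a hedged illustration rather than the CN2-SD implementation: the function name `wracc` and its arguments are hypothetical, and the optional per-example weights only gesture at how a weighted covering scheme could down-weight already covered examples.

```python
def wracc(covered, positive, weights=None):
    """Weighted relative accuracy of a rule Cond -> Class.

    covered[i]  : True if the rule's condition covers example i.
    positive[i] : True if example i belongs to the target class.
    weights[i]  : optional example weight (defaults to 1.0 for all).
    """
    if weights is None:
        weights = [1.0] * len(covered)
    total = sum(weights)
    w_cond = sum(w for w, c in zip(weights, covered) if c)
    if total == 0 or w_cond == 0:
        return 0.0  # rule covers nothing, so it carries no information
    w_class = sum(w for w, p in zip(weights, positive) if p)
    w_both = sum(w for w, c, p in zip(weights, covered, positive) if c and p)
    coverage = w_cond / total              # p(Cond)
    accuracy_gain = w_both / w_cond - w_class / total  # p(Class|Cond) - p(Class)
    return coverage * accuracy_gain


# A rule covering exactly the two positive examples out of four.
covered = [True, True, False, False]
positive = [True, True, False, False]
print(wracc(covered, positive))
```

A rule that covers every example scores zero under this measure, since its accuracy equals the class prior, while a very specific but accurate rule is penalised by its small coverage.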
Inclusive pruning: A new class of pruning rule for unordered search and its application to classification learning.
- In Proceedings of the Nineteenth Australasian Computer Science Conference
, 1996
Cited by 4 (1 self)
This paper presents a new class of pruning rule for unordered search. Previous pruning rules for unordered search identify operators that should not be applied in order to prune nodes reached via those operators. In contrast, the new pruning rules identify operators that should be applied and prune nodes that are not reached via those operators. Specific pruning rules employing both these approaches are identified for classification learning. Experimental results demonstrate that application of the new pruning rules can reduce by more than 60% the number of states from the search space that are considered during classification learning.
A Discrete Approach To Constructive Neural Network Learning
- Neural, Parallel and Scientific Computations
, 1995
Cited by 2 (1 self)
Constructive algorithms have the objectives of improved generalization and simplified learning through dynamic creation of a problem-specific neural network architecture. Here, a parallel learning algorithm which constructs such an architecture is proposed. The algorithm consists of three phases: search through examples for points near the decision boundary; generation of a pool of candidate hyperplanes for boundary approximation; and selection of the separating hyperplanes from the candidate pool. The form of the final architecture is specified by the cardinality of the selected set of hyperplanes, where each individual hyperplane determines connection strengths for one hidden unit of the constructed network. While the algorithm might be too computationally demanding for a sequential implementation, the analytical expressions show that speed-up linear in the number of processors is achievable on distributed or highly parallel systems. The experimental benchmark results on a distribute...
Improving Image Classification by Combining Statistical, Case-Based and Model-Based Prediction Methods
- Fundamenta Informaticae
, 1996
Cited by 2 (0 self)
Evidence for image classification can be considered to come from two sources: traditional statistical information derived algorithmically from image data, and model-based evidence arising from previous expertise and experience in a given application domain. This paper presents a study of classification techniques based on both these sources (traditional algorithmic and model-based), and illustrates how they can be combined. A prototype image classification system, called Cabaress, has been constructed which implements these methods. We evaluate Cabaress as applied to the problem of identifying crops in agricultural fields, based on classifying image segments extracted from radar image data. Our results demonstrate that this mixed-method approach can achieve improved classification accuracy. 1 Introduction Following the successful launch of RADARSAT, research activities in the application of synthetic aperture radar (SAR) have increased significantly. Synthetic aperture radar i...
Quinqueton: Robust k-DNF Learning via Inductive Belief Merging
- ECML
Cited by 1 (0 self)
A central issue in logical concept induction is the prospect of inconsistency. This problem may arise due to noise in the training data, or because the target concept does not fit the underlying concept class. In this paper, we introduce the paradigm of inductive belief merging which handles this issue within a uniform framework. The key idea is to base learning on a belief merging operator that selects the concepts which are as close as possible to the set of training examples. From a computational perspective, we apply this paradigm to robust k-DNF learning. To this end, we develop a greedy algorithm which approximates the optimal concepts to within a logarithmic factor. The time complexity of the algorithm is polynomial in the size of k. Moreover, the method is bidirectional and returns one maximally specific concept and one maximally general concept. We present experimental results showing the effectiveness of our algorithm on both nominal and numerical datasets.
Relational subgroup discovery for gene expression data mining
- In EMBEC: 3rd IFMBE European Medical & Biological Engineering Conf
, 2005
Cited by 1 (0 self)
We propose a methodology for predictive classification from gene expression data, able to combine the robustness of high-dimensional statistical classification methods with the comprehensibility and interpretability of simple logic-based models. We first construct a robust classifier combining contributions of a large number of gene expression values, and then search for compact summarizations of subgroups among genes associated in the classifier with a given class. The subgroups are described by means of relational logic features extracted from publicly available gene annotations. The curse of dimensionality pertaining to the gene expression based classification problem due to the large number of attributes (genes) is turned into an advantage in the secondary subgroup discovery task, as here the original attributes become learning examples.

