An Integrated Instance-Based Learning Algorithm
Computational Intelligence, 2000
Cited by 19 (1 self)
The basic nearest-neighbor rule generalizes well in many domains but has several shortcomings, including inappropriate distance functions, large storage requirements, slow execution time, sensitivity to noise, and an inability to adjust its decision boundaries after storing the training data. This paper proposes methods for overcoming each of these weaknesses and combines these methods into a comprehensive learning system called the Integrated Decremental Instance-Based Learning Algorithm (IDIBL) that seeks to reduce storage, improve execution speed, and increase generalization accuracy, when compared to the basic nearest neighbor algorithm and other learning models. IDIBL tunes its own parameters using a new measure of fitness that combines confidence and cross-validation (CVC) accuracy in order to avoid discretization problems with more traditional leave-one-out cross-validation (LCV). In our experiments IDIBL achieves higher generalization accuracy than other less comprehensive instance-based learning algorithms, while requiring less than one-fourth the storage of the nearest neighbor algorithm and improving execution speed by a corresponding factor. In experiments on 21 datasets, IDIBL also achieves higher generalization accuracy than those reported for 16 major machine learning and neural network models.
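To make the LCV baseline mentioned in this abstract concrete, here is a minimal sketch of leave-one-out cross-validation for a plain 1-nearest-neighbor classifier. It is not IDIBL and does not implement the CVC measure; the function name `loo_accuracy`, the Euclidean distance, and the toy data are assumptions chosen for illustration.

```python
import math

def loo_accuracy(points, labels):
    """Leave-one-out accuracy of a 1-nearest-neighbor classifier."""
    correct = 0
    for i, (p, label) in enumerate(zip(points, labels)):
        # Nearest *other* stored instance under Euclidean distance.
        nearest = min(
            (j for j in range(len(points)) if j != i),
            key=lambda j: math.dist(p, points[j]),
        )
        if labels[nearest] == label:
            correct += 1
    return correct / len(points)

# Toy data (hypothetical): two clusters plus one instance with a conflicting label.
points = [(0.0, 0.0), (0.1, 0.0), (1.0, 1.0), (1.1, 1.0), (0.05, 0.1)]
labels = ["a", "a", "b", "b", "b"]
acc = loo_accuracy(points, labels)
print(acc)
```

With n stored instances this estimate can only take the values 0, 1/n, ..., 1, which is one plausible reading of the discretization problem the abstract attributes to LCV.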
Prototype selection for dissimilarity-based classifiers
Pattern Recognition, 2006
Cited by 19 (1 self)
A conventional way to discriminate between objects represented by dissimilarities is the nearest neighbor method. A more efficient and sometimes more accurate solution is offered by other dissimilarity-based classifiers. They construct a decision rule based on the entire training set, but they need just a small set of prototypes, the so-called representation set, as a reference for classifying new objects. Such alternative approaches may be especially advantageous for non-Euclidean or even non-metric dissimilarities. The choice of a proper representation set for dissimilarity-based classifiers is not yet fully investigated. It appears that a random selection may work well. In this paper, a number of experiments have been conducted on various metric and non-metric dissimilarity representations and prototype selection methods. Several procedures, like traditional feature selection methods (here effectively searching for prototypes), mode seeking and linear programming, are compared to the random selection. In general, we find that systematic approaches lead to better results than the random selection, especially for a small number of prototypes. Although there is no single winner, as it depends on data characteristics, the k-centres method generally works well. For two-class problems, an important observation is that our dissimilarity-based discrimination functions relying on significantly reduced prototype sets (3–10% of the training objects) offer a similar or much better classification accuracy than the best k-NN rule on the entire training set. This may be reached for multi-class data as well; however, such problems are more difficult.
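The idea of a representation set can be sketched as follows: each object is described by its dissimilarities to a few prototypes, and an ordinary classifier can then be trained on those vectors. The sketch uses the random-selection baseline named in the abstract; the helper names, the absolute-difference dissimilarity, and the toy data are assumptions, not the paper's procedure.

```python
import random

def select_random_prototypes(objects, k, seed=0):
    """Random-selection baseline: pick k training objects as prototypes."""
    rng = random.Random(seed)
    return rng.sample(objects, k)

def dissimilarity_representation(objects, prototypes, dissim):
    """Map each object to its vector of dissimilarities to the prototypes."""
    return [[dissim(obj, proto) for proto in prototypes] for obj in objects]

# Toy data (hypothetical): scalars compared by absolute difference.
training = [0.0, 1.0, 2.0, 10.0, 11.0]
prototypes = select_random_prototypes(training, k=2)
vectors = dissimilarity_representation(
    training, prototypes, lambda a, b: abs(a - b)
)
for obj, vec in zip(training, vectors):
    print(obj, vec)
```

Because only `dissim` touches the raw objects, the same code would apply to non-metric dissimilarities; a systematic selector could replace `select_random_prototypes` without changing the rest.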
Fuzzy rule-based systems derived from similarity to prototypes
Lecture Notes in Computer Science, Volume 3316, 2004
Cited by 18 (12 self)
Relations between similarity-based systems, evaluating similarity to some prototypes, and fuzzy rule-based systems, aggregating values of membership functions, are investigated. Similarity measures based on information theory and probabilistic distance functions lead to a new type of membership function applicable to symbolic data. Fuzzy membership functions, on the other hand, lead to a new type of distance function. Several such novel functions are presented. This approach opens new ways to generate fuzzy rules based either on individual features or on their combinations used to evaluate similarity. Transition from prototype-based rules using similarity and fuzzy rules is illustrated using artificial data in two dimensions. As an illustration of the usefulness of prototype-based rules, very simple rules are derived for leukemia gene expression data.
Data Compression and Local Metrics For Nearest Neighbour Classification
1997
Cited by 16 (5 self)
A local distance measure for the nearest neighbor classification rule is shown to achieve high compression rates and high accuracy on real data sets. In the approach
Classification, Association and Pattern Completion Using Neural Similarity Based Methods
APPLIED MATH. & COMP. SCIENCE, 2000
Cited by 16 (15 self)
A framework for Similarity-Based Methods (SBMs) includes many classification models as special cases: neural networks of the Radial Basis Function type, Feature Space Mapping neurofuzzy networks based on separable transfer functions, Learning Vector Quantization, variants of the k-nearest-neighbor method, and several new models that may be presented in network form. Multilayer Perceptrons (MLPs) use scalar products to compute weighted activation of neurons, combining soft hyperplanes to provide decision borders. Distance-based multilayer perceptrons (D-MLPs) evaluate similarity of inputs to weights, offering a natural generalization of standard MLPs. A cluster-based initialization procedure that determines the architecture and the values of all adaptive parameters is described. Networks
Evolutionary Learning of Hierarchical Decision Rules
IEEE Transactions on Systems, Man and Cybernetics, Part B, 2003
Cited by 15 (9 self)
This paper describes an approach based on evolutionary algorithms, hierarchical decision rules (HIDER), for learning rules in continuous and discrete domains. The algorithm produces a hierarchical set of rules; that is, the rules are obtained sequentially and must therefore be tried in order until one is found whose conditions are satisfied. Thus, the number of rules may be reduced because rules may be nested inside one another. The evolutionary algorithm uses both real and binary coding for the individuals of the population. We have tested our system on real data from the UCI Repository, and the results of a ten-fold cross-validation are compared to C4.5s, C4.5Rules, See5s, and See5Rules. The experiments show that HIDER works well in practice.
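The "tried in order" behaviour of a hierarchical rule set can be sketched as an ordered list in which the first matching rule decides the class. This covers only how such rules are applied, not how HIDER evolves them; the interval-based rule format, the `classify` function, and the example rules are hypothetical.

```python
def classify(rules, default, example):
    """Return the label of the first rule whose conditions all hold."""
    for conditions, label in rules:
        # Each condition is a closed interval on one attribute.
        if all(lo <= example[attr] <= hi for attr, (lo, hi) in conditions.items()):
            return label
    return default

# Hypothetical rules: the first region lies inside the second, so order matters.
rules = [
    ({"x": (0.0, 1.0), "y": (0.0, 1.0)}, "inner"),
    ({"x": (0.0, 5.0)}, "outer"),
]

print(classify(rules, "none", {"x": 0.5, "y": 0.5}))
print(classify(rules, "none", {"x": 3.0, "y": 9.0}))
print(classify(rules, "none", {"x": 7.0, "y": 0.0}))
```

Since the inner rule is checked first, the outer rule does not need extra conditions to exclude the inner region, which illustrates how nesting can keep the rule set small.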
Using CBR to select solution strategies in constraint programming
In ICCBR, 2005
Cited by 13 (0 self)
Constraint programming is a powerful paradigm that offers many different strategies for solving problems. Choosing a good strategy is difficult; choosing a poor strategy wastes resources and may result in a problem going unsolved. We show how Case-Based Reasoning can be used to select good strategies. We design experiments which demonstrate that, on two problems with quite different characteristics, CBR can outperform four other strategy selection techniques.
Informal Identification of Outliers in Medical Data
2000
Cited by 12 (0 self)
Informal box plot identification of outliers in real-world medical data was studied. Box plots were used to detect univariate outliers directly, whereas the box-plotted Mahalanobis distances identified multivariate outliers. Vertigo and female urinary incontinence data were used in the tests. The removal of outliers increased the descriptive classification accuracy of discriminant analysis functions and the nearest neighbour method, while the predictive ability of these methods decreased somewhat. Outliers were also evaluated subjectively by expert physicians, who found most of the multivariate outliers to truly be outliers in their area. The experts sometimes disagreed with the method on univariate outliers. This happened, for example, in heterogeneous diagnostic groups where extreme values are also natural. The informal method may be used for straightforward identification of suspicious data or as a tool to collect abnormal cases for an in-depth analysis.
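The two-step procedure described here can be sketched in a few lines: a box plot rule flags univariate outliers, and the same rule applied to Mahalanobis distances flags multivariate ones. The 1.5 × IQR whisker is the conventional box plot rule and is an assumption here, as are the function names, the restriction to two variables, and the toy data.

```python
import math
import statistics

def boxplot_outliers(values, whisker=1.5):
    """Indices of values beyond the box plot whiskers (assumed 1.5 * IQR)."""
    q1, _, q3 = statistics.quantiles(values, n=4, method="inclusive")
    iqr = q3 - q1
    low, high = q1 - whisker * iqr, q3 + whisker * iqr
    return [i for i, v in enumerate(values) if v < low or v > high]

def mahalanobis_2d(points):
    """Mahalanobis distance of each 2-D point from the sample mean."""
    n = len(points)
    mx = sum(x for x, _ in points) / n
    my = sum(y for _, y in points) / n
    # Sample covariance matrix entries (divisor n - 1).
    sxx = sum((x - mx) ** 2 for x, _ in points) / (n - 1)
    syy = sum((y - my) ** 2 for _, y in points) / (n - 1)
    sxy = sum((x - mx) * (y - my) for x, y in points) / (n - 1)
    det = sxx * syy - sxy ** 2
    dists = []
    for x, y in points:
        dx, dy = x - mx, y - my
        # Quadratic form with the inverse of the 2x2 covariance matrix.
        d2 = (syy * dx * dx - 2 * sxy * dx * dy + sxx * dy * dy) / det
        dists.append(math.sqrt(max(d2, 0.0)))
    return dists

# Toy data (hypothetical).
univariate = boxplot_outliers([1, 2, 3, 4, 5, 6, 7, 8, 100])
points = [(1, 2), (2, 3), (3, 5), (4, 4), (5, 6), (6, 9)]
distances = mahalanobis_2d(points)
multivariate = boxplot_outliers(distances)
print(univariate, multivariate)
```

The cases flagged this way are candidates for expert review rather than automatic removal, in line with the abstract's description of the method as informal.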
Inductive Learning for Case-Based Diagnosis with Multiple Faults
In Advances in Case-Based Reasoning, volume 2416 of LNAI, 2002
Cited by 10 (7 self)
We present adapted inductive methods for learning similarities, parameter weights and diagnostic profiles for case-based reasoning. All of these methods can be refined incrementally by applying different types of background knowledge. Diagnostic profiles are used for extending the conventional CBR to solve cases with multiple faults. The context of our work is to supplement a medical documentation and consultation system by CBR techniques, and we present an evaluation with a real-world case base.
A Minimum Risk Metric for Nearest Neighbor Classification
In Proc. 16th International Conf. on Machine Learning, 1999
Cited by 9 (1 self)
Nearest Neighbor is a well-known algorithm extensively studied by the Pattern Recognition and Machine Learning communities and widely exploited in Case Based Reasoning applications. The notion of metric is central to how Nearest Neighbor works, and different feature weighting metrics have been proposed in order to increase its performance. In this work we present an original Probability Based Metric, i.e. a metric for classification tasks that relies on estimates of the posterior probabilities, called Minimum Risk Metric (MRM). MRM is optimal, but it directly optimizes the finite misclassification risk, whereas the Short and Fukunaga Metric minimizes the difference between finite risk and asymptotic risk. An experimental comparison of MRM with the Short and Fukunaga Metric, the Value Difference Metric, and Euclidean-Hamming metrics on benchmark datasets shows that MRM outperforms the other metrics and performs comparably to the Bayes Classifier based on the same probability...
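As a rough illustration of a probability-based measure built on posterior estimates, the sketch below computes the finite risk of giving a query the label of a stored instance: the probability that their labels differ when each is drawn independently from its own posterior. Whether this matches the paper's exact definition of MRM is an assumption, and the function names and posterior values are hypothetical.

```python
def finite_risk(post_x, post_y):
    """Probability that x is misclassified when it takes the label of y.

    post_x and post_y are posterior class probabilities in the same class order.
    """
    return sum(px * (1.0 - py) for px, py in zip(post_x, post_y))

def nearest_by_risk(post_query, stored_posteriors):
    """Index of the stored instance with the smallest finite risk for the query."""
    return min(
        range(len(stored_posteriors)),
        key=lambda i: finite_risk(post_query, stored_posteriors[i]),
    )

# Hypothetical two-class posterior estimates.
query = [0.9, 0.1]
stored = [[0.8, 0.2], [0.3, 0.7], [0.6, 0.4]]
best = nearest_by_risk(query, stored)
print(best, finite_risk(query, stored[best]))
```

In practice the posteriors would have to be estimated from data, so the quality of such a measure depends on the quality of those estimates.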

