A Weighted Nearest Neighbor Algorithm for Learning with Symbolic Features
- Machine Learning, 1993
Cited by 249 (3 self)
In the past, nearest neighbor algorithms for learning from examples have worked best in domains in which all features had numeric values. In such domains, the examples can be treated as points and distance metrics can use standard definitions. In symbolic domains, a more sophisticated treatment of the feature space is required. We introduce a nearest neighbor algorithm for learning in domains with symbolic features. Our algorithm calculates distance tables that allow it to produce real-valued distances between instances, and attaches weights to the instances to further modify the structure of feature space. We show that this technique produces excellent classification accuracy on three problems that have been studied by machine learning researchers: predicting protein secondary structure, identifying DNA promoter sequences, and pronouncing English text. Direct experimental comparisons with the other learning algorithms show that our nearest neighbor algorithm is comparable or superior ...
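The distance-table idea in the abstract above can be illustrated with a rough sketch. It builds, for each symbolic feature, a table of real-valued distances between values from their class-conditional frequencies, then picks the nearest stored instance using per-instance weights. The abstract does not give the formula, so the distance used here (sum of absolute differences of class-conditional probabilities), the choice to multiply the raw distance by the instance weight, the function names, and the toy data are all assumptions made for illustration.

```python
from collections import Counter, defaultdict
from itertools import product


def build_distance_table(values, labels):
    """Map each pair of symbolic values of one feature to a real distance.

    Assumed formula: sum over classes of |P(class | a) - P(class | b)|.
    """
    classes = sorted(set(labels))
    counts = defaultdict(Counter)  # value -> class -> count
    for v, c in zip(values, labels):
        counts[v][c] += 1
    table = {}
    for a, b in product(counts, repeat=2):
        na, nb = sum(counts[a].values()), sum(counts[b].values())
        table[a, b] = sum(
            abs(counts[a][c] / na - counts[b][c] / nb) for c in classes
        )
    return table


def classify(query, examples, labels, weights, tables):
    """Return the label of the stored example with the smallest weighted distance.

    Assumption: an instance weight scales the summed per-feature distance.
    """
    def dist(i):
        raw = sum(tables[f][query[f], examples[i][f]] for f in range(len(query)))
        return weights[i] * raw

    best = min(range(len(examples)), key=dist)
    return labels[best]


# Hypothetical toy data: feature 0 predicts the class, feature 1 does not.
examples = [("a", "x"), ("a", "y"), ("b", "x"), ("b", "y")]
labels = ["pos", "pos", "neg", "neg"]
tables = [
    build_distance_table([e[f] for e in examples], labels) for f in range(2)
]
weights = [1.0] * len(examples)
```

In this toy data the table for feature 0 puts "a" and "b" far apart, while the table for feature 1 gives "x" and "y" a distance of zero, so the uninformative feature does not affect the neighbor choice. A query value that never occurred in the training data has no table entry and would need separate handling.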
Metrics for Nearest Neighbour Discrimination with Categorical Attributes
- In Proceedings of the Seventh Annual International Conference of the British Computer Society Specialist Group on Expert Systems (ES 97), 1997
Cited by 2 (0 self)
A fresh look is taken at the Value Difference Metric (VDM) which was proposed by Stanfill & Waltz (1986). This was originally developed to replace the Hamming distance for use in memory-based classification in problem domains where all the attributes are symbolic. It is concluded that the basic idea of using such a metric is an ingenious one that would be expected to improve the performance of a nearest neighbour algorithm by virtue of its intrinsic weighting properties. Unfortunately, simulation shows that there is some doubt over the suitability of this metric because of its bias in favour of asymmetric value probabilities. However, it is explained that, statistically speaking, the VDM is really a measure of association between class and a pair of attribute values. It is shown that certain other association measures (namely H_T and chi^2/N) could be used in the same sort of way as the VDM for the purpose of providing distance metrics for use in nearest neighbour algorithms with symbolic attributes and that these measures are not biased.
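One possible reading of "a measure of association between class and a pair of attribute values" is sketched below: restrict the training data to instances carrying either of the two values, form the 2 x C contingency table of value against class, and compute chi^2/N on it. This is an interpretation for illustration, not the paper's definition. The function name and the handling of empty or identical inputs are assumptions.

```python
from collections import Counter


def chi2_over_n(values, labels, a, b):
    """chi^2 / N for the 2 x C table of (value a or value b) against class.

    Larger results mean the two values separate the classes more strongly,
    so the result can serve as a distance between a and b.
    """
    if a == b:
        return 0.0
    rows = {a: Counter(), b: Counter()}  # value -> class -> count
    for v, c in zip(values, labels):
        if v in rows:
            rows[v][c] += 1
    n = sum(sum(r.values()) for r in rows.values())
    if n == 0:
        return 0.0
    classes = set(rows[a]) | set(rows[b])
    col = {c: rows[a][c] + rows[b][c] for c in classes}  # column totals
    chi2 = 0.0
    for r in rows.values():
        row_total = sum(r.values())
        for c in classes:
            expected = row_total * col[c] / n
            if expected > 0:
                chi2 += (r[c] - expected) ** 2 / expected
    return chi2 / n
```

For a table with two rows the result lies between 0 (the two values have identical class distributions) and 1 (the two values never share a class).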
Geometry-Based Learning Algorithms
- 1993
We present CHILS, the Convex Hull Inductive Learning System, a novel supervised learning algorithm based on approximating concepts with sets of convex hulls. We introduce a theoretical methodology for describing the power of a concept representation language and use it to compare convex hulls with other geometrical concept representations. The Domain Transform framework (DT) provides a clear way to compare the power of supervised learning systems, allowing us to characterize a class of domains which is learnable by some systems but cannot be learned by other systems. DT can be used similarly to compare the expected generalization performance of different domains.
1 Introduction
When performing studies of supervised machine learning algorithms, and when designing new algorithms in particular, it is important to keep in mind the distinction between the learning element and the performance element. Regardless of the intricacies of the learning element, mathematically the performance ele...

