A Review and Empirical Evaluation of Feature Weighting Methods for a Class of Lazy Learning Algorithms
Artificial Intelligence Review, 1997
Cited by 94 (0 self)
Many lazy learning algorithms are derivatives of the k-nearest neighbor (k-NN) classifier, which uses a distance function to generate predictions from stored instances. Several studies have shown that k-NN's performance is highly sensitive to the definition of its distance function. Many k-NN variants have been proposed to reduce this sensitivity by parameterizing the distance function with feature weights. However, these variants have not been categorized nor empirically compared. This paper reviews a class of weight-setting methods for lazy learning algorithms. We introduce a framework for distinguishing these methods and empirically compare them. We observed four trends from our experiments and conducted further studies to highlight them. Our results suggest that methods which use performance feedback to assign weight settings demonstrated three advantages over other methods: they require less pre-processing, perform better in the presence of interacting features, and generally require less training data to learn good settings. We also found that continuous weighting methods tend to outperform feature selection algorithms for tasks where some features are useful but less important than others.
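To make the idea of parameterizing the distance function with feature weights concrete, here is a minimal sketch of a k-NN classifier with a weighted Euclidean distance. The function name `weighted_knn_predict`, the squared-difference form of the distance, and the majority-vote rule are illustrative assumptions; the abstract does not specify which distance function or weight-setting method the paper uses.

```python
from collections import Counter

import numpy as np


def weighted_knn_predict(X_train, y_train, query, weights, k=3):
    """Classify `query` by majority vote among its k nearest stored instances.

    The distance is a feature-weighted Euclidean distance (an assumption made
    for illustration): sqrt(sum_f weights[f] * (x[f] - query[f]) ** 2).
    """
    X_train = np.asarray(X_train, dtype=float)
    query = np.asarray(query, dtype=float)
    weights = np.asarray(weights, dtype=float)

    diffs = X_train - query
    dists = np.sqrt((weights * diffs ** 2).sum(axis=1))

    # A stable sort keeps ties in stored-instance order.
    nearest = np.argsort(dists, kind="stable")[:k]
    votes = Counter(y_train[i] for i in nearest)
    return votes.most_common(1)[0][0]
```

Setting a weight to 0 removes that feature from the distance entirely, so feature selection can be viewed as the special case of binary weights, while continuous weights let a feature count for less without being discarded.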
An Experimental Comparison of the Nearest-Neighbor and Nearest-Hyperrectangle Algorithms
Machine Learning, 1995
Cited by 81 (5 self)
Algorithms based on Nested Generalized Exemplar (NGE) theory (Salzberg, 1991) classify new data points by computing their distance to the nearest "generalized exemplar" (i.e., either a point or an axis-parallel rectangle). They combine the distance-based character of nearest neighbor (NN) classifiers with the axis-parallel rectangle representation employed in many rule-learning systems. An implementation of NGE was compared to the k-nearest neighbor (kNN) algorithm in 11 domains and found to be significantly inferior to kNN in 9 of them. Several modifications of NGE were studied to understand the cause of its poor performance. These show that its performance can be substantially improved by preventing NGE from creating overlapping rectangles, while still allowing complete nesting of rectangles. Performance can be further improved by modifying the distance metric to allow weights on each of the features (Salzberg, 1991). Best results were obtained in this study when the weights were co...
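The distance from a point to an axis-parallel rectangle can be sketched as follows. This is a simplified illustration, not the exact NGE metric: the function name `rect_distance` is hypothetical, the per-feature weights are applied directly to each coordinate gap, and any range normalization or per-exemplar weighting the original algorithm may use is omitted.

```python
import numpy as np


def rect_distance(point, lower, upper, weights=None):
    """Weighted Euclidean distance from `point` to an axis-parallel rectangle.

    `lower` and `upper` hold the rectangle's bounds per feature. A point
    inside the rectangle has distance 0; a rectangle with lower == upper
    degenerates to a single stored point.
    """
    point = np.asarray(point, dtype=float)
    lower = np.asarray(lower, dtype=float)
    upper = np.asarray(upper, dtype=float)
    if weights is None:
        weights = np.ones_like(point)
    else:
        weights = np.asarray(weights, dtype=float)

    # Per-feature gap: how far the point lies outside [lower, upper].
    gap = np.maximum(lower - point, 0.0) + np.maximum(point - upper, 0.0)
    return float(np.sqrt(np.sum((weights * gap) ** 2)))
```

Classification then assigns the class of the generalized exemplar with the smallest distance, which reduces to plain nearest neighbor when every exemplar is a point.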
The Link Between Brain Learning, Attention, And Consciousness
1998
Cited by 65 (28 self)
The processes whereby our brains continue to learn about a changing world in a stable fashion throughout life are proposed to lead to conscious experiences. These processes include the learning of top-down expectations, the matching of these expectations against bottom-up data, the focusing of attention upon the expected clusters of information, and the development of resonant states between bottom-up and top-down processes as they reach an attentive consensus between what is expected and what is there in the outside world. It is suggested that all conscious states in the brain are resonant states, and that these resonant states trigger learning of sensory and cognitive representations. The models which summarize these concepts are therefore called Adaptive Resonance Theory, or ART, models. Psychophysical and neurobiological data in support of ART are presented from early vision, visual object recognition, auditory streaming, variable-rate speech perception, somatosensory perception, a...
Extracting Comprehensible Models from Trained Neural Networks
1996
Cited by 65 (4 self)
To Mom, Dad, and Susan, for their support and encouragement.
Representation is Representation of Similarities
Behavioral and Brain Sciences, 1996
Cited by 60 (15 self)
Advanced perceptual systems are faced with the problem of securing a principled relationship between the world and its internal representation. I propose a unified approach to visual representation, based on Shepard's (1968) notion of second-order isomorphism. According to the proposed theory, a shape is represented internally by the responses of a few tuned modules, each of which is broadly selective for some reference shape, whose similarity to the stimulus it measures. The result is a philosophically appealing, computationally feasible, biologically credible, and formally veridical representation of a distal shape space. This approach supports representation of and discrimination among shapes radically different from the reference ones, while bypassing the need for the computationally problematic decomposition into parts; it also addresses the needs of shape categorization, and can be used to derive a range of models of perceived similarity. Representation is Representation of Sim...
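The idea of representing a shape by the responses of a few tuned modules can be illustrated with a small sketch. Everything here is an assumption made for illustration: shapes are taken to be plain feature vectors, each module is modeled as a Gaussian tuning curve centered on its reference shape, and the names `module_responses` and `width` are hypothetical. The abstract does not state the form of the modules.

```python
import numpy as np


def module_responses(stimulus, references, width=1.0):
    """Represent `stimulus` by its similarity to each reference shape.

    Each row of `references` is one reference shape. The response of a
    module is a Gaussian function of the distance between the stimulus and
    that module's reference, so it is 1.0 at the reference and decays
    smoothly (broad tuning) as the stimulus moves away.
    """
    stimulus = np.asarray(stimulus, dtype=float)
    references = np.asarray(references, dtype=float)
    sq_dist = np.sum((references - stimulus) ** 2, axis=1)
    return np.exp(-sq_dist / (2.0 * width ** 2))
```

The resulting vector has one entry per reference shape, so a stimulus that matches none of the references still receives a graded, low-dimensional code.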
Fast learning VIEWNET architectures for recognizing 3D objects from multiple 2-D views
Neural Networks, 1995
Cited by 46 (12 self)
The recognition of three-dimensional (3-D) objects from sequences of their two-dimensional (2-D) views is modeled by a family of self-organizing neural architectures, called VIEWNET, that use View Information Encoded With NETworks. VIEWNET incorporates a preprocessor that generates a compressed but 2-D invariant representation of an image, a supervised incremental learning system that classifies the preprocessed representations into 2-D view categories whose outputs are combined into 3-D invariant object categories, and a working memory that makes a 3-D object prediction by accumulating evidence from 3-D object category nodes as multiple 2-D views are experienced. The simplest VIEWNET achieves high recognition scores without the need to explicitly code the temporal order of 2-D views in working memory. Working memories are also discussed that save memory resources by implicitly coding temporal order in terms of the relative activity of 2-D view category nodes, rather than as explicit 2-D view transitions. Variants of the VIEWNET architecture may be used for scene understanding by using a preprocessor and classifier that can determine both what objects are in a scene and where they are located. The present VIEWNET preprocessor includes the CORT-X 2 filter, which discounts the illuminant, regularizes and completes figural boundaries, and suppresses image noise. This boundary segmentation is rendered invariant under 2-D translation, rotation, and dilation by use of a log-polar transform. The invariant spectra undergo Gaussian coarse coding to further reduce noise and 3-D foreshortening effects, and to increase generalization. These compressed codes are input into the
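The property that makes a log-polar transform useful for this kind of invariance can be checked numerically: in log-polar coordinates, scaling and rotating a pattern about the origin only shifts its coordinates by a constant. The sketch below demonstrates that property on a few 2-D points. The helper names are hypothetical, and the sketch does not reproduce the VIEWNET preprocessor; in particular, how translation is handled and how the remaining shift is removed are not covered here.

```python
import numpy as np


def log_polar(points):
    """Map 2-D points (x, y) to (log r, theta). Points must not be the origin."""
    pts = np.asarray(points, dtype=float)
    r = np.hypot(pts[:, 0], pts[:, 1])
    theta = np.arctan2(pts[:, 1], pts[:, 0])
    return np.column_stack([np.log(r), theta])


def scale_and_rotate(points, scale, angle):
    """Dilate by `scale` and rotate counterclockwise by `angle` (radians)."""
    c, s = np.cos(angle), np.sin(angle)
    rotation = np.array([[c, -s], [s, c]])
    return scale * (np.asarray(points, dtype=float) @ rotation.T)


points = np.array([[1.0, 0.0], [0.0, 2.0], [1.0, 1.0]])
moved = scale_and_rotate(points, scale=2.0, angle=0.3)

# Every point shifts by the same (log scale, angle) offset.
shift = log_polar(moved) - log_polar(points)
```

Because the offset is identical for every point, any later stage that is insensitive to a uniform shift becomes insensitive to dilation and rotation of the input. The angles in this example are chosen so that no point wraps around the ±π boundary.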
The Hippocampus And Cerebellum In Adaptively Timed Learning, Recognition, And Movement
1995
Cited by 45 (25 self)
The concepts of declarative memory and procedural memory have been used to distinguish two basic types of learning. A neural network model suggests how such memory processes work together as recognition learning, reinforcement learning, and sensory-motor learning take place during adaptive behaviors. To coordinate these processes, the hippocampal formation and cerebellum each contain circuits that learn to adaptively time their outputs. Within the model, hippocampal timing helps to maintain attention on motivationally salient goal objects during variable task-related delays, and cerebellar timing controls the release of conditioned responses. This property is part of the model's description of how cognitive-emotional interactions focus attention on motivationally valued cues, and how this process breaks down due to hippocampal ablation. The model suggests that the hippocampal mechanisms that help to rapidly draw attention to salient cues could prematurely release motor commands were no...
Co-evolving Intertwined Spirals
In Proceedings of the Fifth Annual Conference on Evolutionary Programming, 1996
Cited by 42 (14 self)
We recently solved the two spirals problem, a difficult neural network benchmark classification problem, using the genetic programming primitives set up by [Koza, 1992]. Instead of using absolute fitness, we use a relative fitness based on a competition for coverage of the data set. This is a form of co-evolutionary search because the fitness function changes with the population. Because niches are opened by proportionate reproduction, rather than crowded out, and because of the crossover operator, we find solutions which have a nice modular structure. Our experiments used our Massively Parallel Genetic Programming (MPGP) system running on a SIMD machine of 4096 processors, the Maspar MP-2.
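One way to picture a relative fitness based on competition for coverage is a shared-credit scheme, sketched below. This is only one possible reading of the abstract and not necessarily the authors' formulation: each data case is assumed to carry one unit of credit, split equally among the individuals that classify it correctly. The function name `coverage_fitness` is hypothetical.

```python
def coverage_fitness(correct):
    """Relative fitness from competition for coverage of the data set.

    `correct[i][j]` is True when individual i classifies case j correctly.
    Each case is worth one unit of credit, shared equally among all
    individuals that cover it (an illustrative assumption).
    """
    n_cases = len(correct[0])
    # How many individuals cover each case.
    solvers = [sum(1 for row in correct if row[j]) for j in range(n_cases)]
    return [
        sum(1.0 / solvers[j] for j in range(n_cases) if row[j])
        for row in correct
    ]


population = [
    [True, True, False],   # covers cases 0 and 1
    [True, False, False],  # covers case 0 only
    [False, False, True],  # covers case 2 only
]
scores = coverage_fitness(population)
```

Under this scheme an individual's score changes whenever the rest of the population changes, which is the sense in which the fitness function moves with the population. An individual covering a case nobody else covers earns the full unit, so rarely covered regions of the data remain attractive niches.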
A Survey of Fuzzy Clustering Algorithms for Pattern Recognition
1998
Cited by 38 (2 self)
Clustering algorithms aim at modelling fuzzy (i.e., ambiguous) unlabeled patterns efficiently. Our goal is to propose a theoretical framework where clustering systems can be compared on the basis of their learning strategies. In the first part of this work, the following issues are reviewed: relative (probabilistic) and absolute (possibilistic) fuzzy membership functions and their relationships to the Bayes rule, batch and on-line learning, growing and pruning networks, modular network architectures, topologically perfect mapping, ecological nets and neuro-fuzziness. From this discussion an equivalence between the concepts of fuzzy clustering and soft competitive learning in clustering algorithms is proposed as a unifying framework in the comparison of clustering systems. Moreover, a set of functional attributes is selected for use as dictionary entries in our comparison. In the second part of this paper, five clustering algorithms taken from the literature are reviewed and compared on...
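The distinction between relative (probabilistic) and absolute (possibilistic) membership functions can be illustrated with two standard formulas from the fuzzy clustering literature: the fuzzy c-means membership, which is normalized across clusters, and the possibilistic c-means membership, which depends only on the distance to one cluster. The choice of these two formulas, the function names, and the parameters `m` (fuzzifier) and `eta` (cluster scale) are assumptions for illustration; the abstract does not say which forms the survey analyzes.

```python
import numpy as np


def _squared_distances(x, centers):
    centers = np.asarray(centers, dtype=float)
    return np.sum((centers - np.asarray(x, dtype=float)) ** 2, axis=1)


def relative_memberships(x, centers, m=2.0):
    """Fuzzy c-means style memberships: they always sum to 1 over clusters."""
    d2 = np.maximum(_squared_distances(x, centers), 1e-12)  # avoid division by zero
    inv = d2 ** (-1.0 / (m - 1.0))
    return inv / inv.sum()


def absolute_memberships(x, centers, eta=1.0, m=2.0):
    """Possibilistic style memberships: each depends on one cluster only."""
    d2 = _squared_distances(x, centers)
    return 1.0 / (1.0 + (d2 / eta) ** (1.0 / (m - 1.0)))


centers = [[0.0, 0.0], [2.0, 0.0]]
outlier = [1.0, 100.0]
rel = relative_memberships(outlier, centers)
ab = absolute_memberships(outlier, centers)
```

For the distant point in this example, the relative memberships are still 0.5 for each cluster because they are forced to sum to 1, while the absolute memberships are both close to 0, which reflects that the point is not typical of either cluster.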
A Hybrid Nearest-Neighbor and Nearest-Hyperrectangle Algorithm
In the Proceedings of the 7th European Conference on Machine Learning, 1994
Cited by 33 (0 self)
Algorithms based on Nested Generalized Exemplar (NGE) theory [10] classify new data points by computing their distance to the nearest "generalized exemplar" (i.e. an axis-parallel multidimensional rectangle). An improved version of NGE, called BNGE, was previously shown to perform comparably to the Nearest Neighbor algorithm. Advantages of the NGE approach include compact representation of the training data and fast training and classification. A hybrid method that combines BNGE and the k-Nearest Neighbor algorithm, called KBNGE, is introduced for improved classification accuracy. Results from eleven domains show that KBNGE achieves generalization accuracies similar to the k-Nearest Neighbor algorithm at improved classification speed. KBNGE is a fast and easy-to-use inductive learning algorithm that gives very accurate predictions in a variety of domains and represents the learned knowledge in a manner that can be easily interpreted by the user. 1 Introduction Salzberg [10] describe...

