Results 1 - 10 of 258
Improved Heterogeneous Distance Functions - Journal of Artificial Intelligence Research, 1997
Cited by 173 (9 self)
Instance-based learning techniques typically handle continuous and linear input values well, but often do not handle nominal input attributes appropriately. The Value Difference Metric (VDM) was designed to find reasonable distance values between nominal attribute values, but it largely ignores continuous attributes, requiring discretization to map continuous values into nominal values. This paper proposes three new heterogeneous distance functions, called the Heterogeneous Value Difference Metric (HVDM), the Interpolated Value Difference Metric (IVDM), and the Windowed Value Difference Metric (WVDM). These new distance functions are designed to handle applications with nominal attributes, continuous attributes, or both. In experiments on 48 applications the new distance metrics achieve higher classification accuracy on average than three previous distance functions on those datasets that have both nominal and continuous attributes. ...
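The abstract above does not state the VDM formula. A commonly cited simplified form compares two values x and y of a nominal attribute by their class-conditional probabilities: vdm(x, y) = sum over classes c of |P(c | x) - P(c | y)|^q. The following is a minimal Python sketch under that assumption; the function names `fit_vdm` and `vdm`, the exponent default `q=2`, and the toy data are illustrative and not taken from the paper.

```python
from collections import Counter, defaultdict


def fit_vdm(values, labels):
    """Estimate P(class | value) for one nominal attribute from training data.

    Returns a dict mapping each attribute value to a list of class
    probabilities, ordered by the sorted class labels.
    """
    counts = defaultdict(Counter)
    for value, label in zip(values, labels):
        counts[value][label] += 1
    classes = sorted(set(labels))
    return {
        value: [cnt[c] / sum(cnt.values()) for c in classes]
        for value, cnt in counts.items()
    }


def vdm(probs, x, y, q=2):
    """Simplified VDM between two values of the same nominal attribute."""
    return sum(abs(px - py) ** q for px, py in zip(probs[x], probs[y]))


# Toy data (hypothetical): one nominal attribute and a two-class label.
values = ["red", "red", "blue", "blue", "green", "green"]
labels = ["a", "a", "b", "b", "a", "b"]
probs = fit_vdm(values, labels)
```

Under this definition, two values are close when they predict similar class distributions, regardless of how the values are spelled or ordered; in the toy data, "red" is closer to "green" than to "blue".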
The Helmholtz Machine, 1995
Cited by 165 (22 self)
Discovering the structure inherent in a set of patterns is a fundamental aim of statistical inference or learning. One fruitful approach is to build a parameterized stochastic generative model, independent draws from which are likely to produce the patterns. For all but the simplest generative models, each pattern can be generated in exponentially many ways. It is thus intractable to adjust the parameters to maximize the probability of the observed patterns. We describe a way of finessing this combinatorial explosion by maximizing an easily computed lower bound on the probability of the observations. Our method can be viewed as a form of hierarchical self-supervised learning that may relate to the function of bottom-up and top-down cortical processing pathways.
Bidirectional Associative Memories - IEEE Transactions on Systems, Man, and Cybernetics, 1988
Cited by 138 (3 self)
Stability and encoding properties of two-layer nonlinear feedback neural networks are examined. Bidirectionality, forward and backward information flow, is introduced in neural nets to produce two-way associative search for stored associations (A_i, B_i). Passing information through M gives one direction; passing it through its transpose M^T gives the other. A bidirectional associative memory (BAM) behaves as a heteroassociative content addressable memory (CAM), storing and recalling the vector pairs (A_1, B_1), ..., (A_m, B_m), where A_i ∈ {0,1}^n and B_i ∈ {0,1}^p. We prove that every n-by-p matrix M is a bidirectionally stable heteroassociative CAM for both binary/bipolar and continuous neurons a_i and b_j. When the BAM neurons are activated, the network quickly evolves to a stable state of two-pattern reverberation, or resonance. The stable reverberation corresponds to a system energy local minimum. Heteroassociative information is encoded in a BAM by summing correlation matrices. The BAM storage capacity for reliable recall is roughly m < min(n, p). No more heteroassociative pairs can be reliably stored and recalled than the lesser of the dimensions of the pattern spaces {0,1}^n and {0,1}^p. The Appendix shows that it is better on average to use bipolar {-1,1} coding than binary {0,1} coding of heteroassociative pairs (A_i, B_i). BAM encoding and decoding are combined in the adaptive BAM, which extends global bidirectional stability to realtime unsupervised learning. Temporal patterns (A_1, ..., A_m) are represented as ordered lists of binary/bipolar vectors and stored in a temporal associative memory (TAM) n-by-n matrix M as a limit cycle of the dynamical system. Forward recall proceeds through M, backward recall through M^T. Temporal patterns are stored by summing contiguous bipolar...
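The encoding and recall scheme described in this abstract (sum the correlation matrices of bipolar-coded pairs, recall forward through M and backward through its transpose) can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the paper's implementation: the helper names are hypothetical, inputs are converted to bipolar form before each pass, and a neuron whose net input is exactly zero keeps its previous state.

```python
def to_bipolar(bits):
    """Map a binary {0,1} vector to bipolar {-1,+1} coding."""
    return [2 * b - 1 for b in bits]


def encode(pairs):
    """Build the n-by-p matrix M by summing outer products of bipolar pairs."""
    n, p = len(pairs[0][0]), len(pairs[0][1])
    M = [[0] * p for _ in range(n)]
    for a_bits, b_bits in pairs:
        a, b = to_bipolar(a_bits), to_bipolar(b_bits)
        for i in range(n):
            for j in range(p):
                M[i][j] += a[i] * b[j]
    return M


def threshold(net, previous):
    """Binary threshold; a zero net input leaves the neuron's state unchanged."""
    return [1 if x > 0 else 0 if x < 0 else prev
            for x, prev in zip(net, previous)]


def forward(M, a_bits, b_prev):
    """Pass an A pattern through M to obtain a B pattern."""
    a = to_bipolar(a_bits)
    net = [sum(a[i] * M[i][j] for i in range(len(M)))
           for j in range(len(M[0]))]
    return threshold(net, b_prev)


def backward(M, b_bits, a_prev):
    """Pass a B pattern through the transpose of M to obtain an A pattern."""
    b = to_bipolar(b_bits)
    net = [sum(M[i][j] * b[j] for j in range(len(M[0])))
           for i in range(len(M))]
    return threshold(net, a_prev)


def recall(M, a_bits, max_iters=100):
    """Alternate forward and backward passes until the pair stops changing."""
    a = list(a_bits)
    b = forward(M, a, [0] * len(M[0]))
    for _ in range(max_iters):
        a_next = backward(M, b, a)
        b_next = forward(M, a_next, b)
        if a_next == a and b_next == b:
            break
        a, b = a_next, b_next
    return a, b


# Two stored associations with n = 6 and p = 4 (toy data).
pairs = [([1, 0, 1, 0, 1, 0], [1, 1, 0, 0]),
         ([1, 1, 1, 0, 0, 0], [1, 0, 1, 0])]
M = encode(pairs)
noisy_a1 = [1, 0, 1, 0, 1, 1]  # first A pattern with its last bit flipped
restored = recall(M, noisy_a1)
```

With these two stored pairs, presenting the first A pattern with one bit flipped settles on the original stored pair after one forward and one backward pass, which is the two-pattern resonance the abstract refers to.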
Constructive Incremental Learning from Only Local Information, 1998
Cited by 126 (35 self)
... This article illustrates the potential learning capabilities of purely local learning and offers an interesting and powerful approach to learning with receptive fields.
Hierarchical Bayesian Inference in the Visual Cortex, 2002
Cited by 106 (0 self)
In this paper, we propose a Bayesian theory of hierarchical cortical computation based both on (a) the mathematical and computational ideas of computer vision and pattern theory and on (b) recent neurophysiological experimental evidence. We [1, 2] have proposed that Grenander's pattern theory [3] could potentially model the brain as a generative model in such a way that feedback serves to disambiguate and 'explain away' the earlier representation. The Helmholtz machine [4, 5] was an excellent step towards approximating this proposal, with feedback implementing priors. Its development, however, was rather limited, dealing only with binary images. Moreover, its feedback mechanisms were engaged only during the learning of the feedforward connections but not during perceptual inference, though the Gibbs sampling process for inference can potentially be interpreted as top-down feedback disambiguating low level representations. Rao and Ballard's predictive coding/Kalman filter model [6] did integrate generative feedback in the perceptual inference process, but it was primarily a linear model and thus severely limited in practical utility. The data-driven Markov Chain Monte Carlo approach of Zhu and colleagues [7, 8] might be the most successful recent application of this proposal in solving real and difficult computer vision problems using generative models, though its connection to the visual cortex has not been explored. Here, we bring in a powerful and widely applicable paradigm from artificial intelligence and computer vision to propose some new ideas about the algorithms of visual cortical processing and the nature of representations in the visual cortex. We will review some of our and others' neurophysiological experimental data to lend support to these ideas.
Reduction Techniques for Instance-Based Learning Algorithms - Machine Learning, 2000
Cited by 93 (2 self)
Instance-based learning algorithms are often faced with the problem of deciding which instances to store for use during generalization. Storing too many instances can result in large memory requirements and slow execution speed, and can cause an oversensitivity to noise. This paper has two main purposes. First, it provides a survey of existing algorithms used to reduce storage requirements in instance-based learning algorithms and other exemplar-based algorithms. Second, it proposes six additional reduction algorithms called DROP1--DROP5 and DEL (three of which were first described in Wilson & Martinez, 1997c, as RT1--RT3) that can be used to remove instances from the concept description. These algorithms and 10 algorithms from the survey are compared on 31 classification tasks. Of those algorithms that provide substantial storage reduction, the DROP algorithms have the highest average generalization accuracy in these experiments, especially in the presence of uniform class noise. ...
A survey of outlier detection methodologies - Artificial Intelligence Review, 2004
Cited by 80 (3 self)
Outlier detection has been used for centuries to detect and, where appropriate, remove anomalous observations from data. Outliers arise due to mechanical faults, changes in system behaviour, fraudulent behaviour, human error, instrument error or simply through natural deviations in populations. Their detection can identify system faults and fraud before they escalate with potentially catastrophic consequences. It can identify errors and remove their contaminating effect on the data set, and as such purify the data for processing. The original outlier detection methods were arbitrary, but now principled and systematic techniques are used, drawn from the full gamut of Computer Science and Statistics. In this paper, we introduce a survey of contemporary techniques for outlier detection. We identify their respective motivations and distinguish their advantages and disadvantages in a comparative review.
Dynamic Model of Visual Recognition Predicts Neural Response Properties in the Visual Cortex - Neural Computation, 1995
Cited by 77 (20 self)
In this paper, we describe a hierarchical network model of visual recognition that explains these experimental observations by using a form of the extended Kalman filter as given by the Minimum Description Length (MDL) principle. The model dynamically combines input-driven bottom-up signals with expectation-driven top-down signals to predict current recognition state. Synaptic weights in the model are adapted in a Hebbian manner according to a learning rule also derived from the MDL principle. The resulting prediction/learning scheme can be viewed as implementing a form of the Expectation-Maximization (EM) algorithm. The architecture of the model posits an active computational role for the reciprocal connections between adjoining visual cortical areas in determining neural response properties. In particular, the model demonstrates the possible role of feedback from higher cortical areas in mediating neurophysiological effects due to stimuli from beyond the classical receptive field. ...
The Link Between Brain Learning, Attention, And Consciousness, 1998
Cited by 65 (28 self)
The processes whereby our brains continue to learn about a changing world in a stable fashion throughout life are proposed to lead to conscious experiences. These processes include the learning of top-down expectations, the matching of these expectations against bottom-up data, the focusing of attention upon the expected clusters of information, and the development of resonant states between bottom-up and top-down processes as they reach an attentive consensus between what is expected and what is there in the outside world. It is suggested that all conscious states in the brain are resonant states, and that these resonant states trigger learning of sensory and cognitive representations. The models which summarize these concepts are therefore called Adaptive Resonance Theory, or ART, models. Psychophysical and neurobiological data in support of ART are presented from early vision, visual object recognition, auditory streaming, variable-rate speech perception, somatosensory perception, a...
Neuronal Architectures for Pattern-theoretic Problems - Large-Scale Theories of the Cortex, 1994
Cited by 65 (1 self)
... this paper is the proposition that the computational analysis of vision -- and speech, tactile sensing, motor control, etc. -- (the theory of the computation as Marr called it (Marr, 82)) is reaching a point where it can provide a clearer and deeper description of the essential tasks of vision as well as a wide range of other cognitive tasks. For instance, the development of algorithms for character recognition or for face recognition or for road tracking from a moving vehicle (three problems which have been much studied on account of their potential applications) forces the researcher to deal with noisy, complex real world data. In doing this, one's initial ideas about what parts of the problem are difficult, what parts are simple, may turn out to be quite wrong. Quite often, a step which one thinks of as a simple pre-processing clean up operation turns out to be very difficult and pinpoints for you a new class of problems which had been ignored. Introspection turns out often to be a very poor guide to the complexity of a problem. The reason for this, we believe, is our subjective impression of perceiving instantaneously and effortlessly the significance of sensory patterns, e.g. the word being spoken or which face is being seen. Many psychological experiments however have shown that what we perceive is not the true sensory signal, but a rational reconstruction of what the signal should be. This means that the messy ambiguous raw signal never makes it to our consciousness but gets overlaid with a clearly and precisely patterned version which could never have been computed without the extensive use of memories, expectations and logic. Only when you attempt to duplicate such a skill by computer do you discover all the hidden complexity in the computation. We believe ...

