Results 1 - 10 of 13
Fast learning VIEWNET architectures for recognizing 3D objects from multiple 2-D views
- Neural Networks
, 1995
Cited by 46 (12 self)
The recognition of three-dimensional (3-D) objects from sequences of their two-dimensional (2-D) views is modeled by a family of self-organizing neural architectures, called VIEWNET, that use View Information Encoded With NETworks. VIEWNET incorporates a preprocessor that generates a compressed but 2-D invariant representation of an image, a supervised incremental learning system that classifies the preprocessed representations into 2-D view categories whose outputs are combined into 3-D invariant object categories, and a working memory that makes a 3-D object prediction by accumulating evidence from 3-D object category nodes as multiple 2-D views are experienced. The simplest VIEWNET achieves high recognition scores without the need to explicitly code the temporal order of 2-D views in working memory. Working memories are also discussed that save memory resources by implicitly coding temporal order in terms of the relative activity of 2-D view category nodes, rather than as explicit 2-D view transitions. Variants of the VIEWNET architecture may be used for scene understanding by using a preprocessor and classifier that can determine both what objects are in a scene and where they are located. The present VIEWNET preprocessor includes the CORT-X 2 filter, which discounts the illuminant, regularizes and completes figural boundaries, and suppresses image noise. This boundary segmentation is rendered invariant under 2-D translation, rotation, and dilation by use of a log-polar transform. The invariant spectra undergo Gaussian coarse coding to further reduce noise and 3-D foreshortening effects, and to increase generalization. These compressed codes are input into the ...
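The evidence-accumulation step described in this abstract can be illustrated with a minimal sketch. It assumes, hypothetically, that each 2-D view has already been classified into a single 3-D object label and that the working memory simply counts votes without coding the order of views. The function name and the vote-count rule are illustrative assumptions, not the VIEWNET implementation.

```python
from collections import Counter


def accumulate_object_evidence(view_predictions):
    """Return the running best object guess after each 2-D view.

    view_predictions: sequence of object labels, one per view (assumed input).
    The vote-count rule is a hypothetical stand-in for the working memory.
    """
    evidence = Counter()
    guesses = []
    for label in view_predictions:
        evidence[label] += 1  # one unit of evidence per view
        # On ties, max() keeps the label that entered the counter first.
        guesses.append(max(evidence, key=evidence.get))
    return guesses


if __name__ == "__main__":
    print(accumulate_object_evidence(["cone", "cube", "cube"]))
```

Because only counts are stored, the final prediction does not depend on the order in which the views arrive, which mirrors the abstract's remark that temporal order need not be coded explicitly.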
Neural dynamics of variable-rate speech categorization
- J. Exp. Psych. Hum. Perception Performance
, 1997
Cited by 46 (23 self)
What is the neural representation of a speech code as it evolves in time? A neural model simulates data concerning segregation and integration of phonetic percepts. Hearing two phonetically related stops in a VC-CV pair (V = vowel; C = consonant) requires 150 ms more closure time than hearing two phonetically different stops in a VC1-C2V pair. Closure time also varies with long-term stimulus rate. The model simulates rate-dependent category boundaries that emerge from feedback interactions between a working memory for short-term storage of phonetic items and a list categorization network for grouping sequences of items. The conscious speech code is a resonant wave. It emerges after bottom-up signals from the working memory select list chunks which read out top-down expectations that amplify and focus attention on consistent working memory items. In VC1-C2V pairs, resonance is reset by mismatch of C2 with the C1 expectation. In VC-CV pairs, resonance prolongs a repeated C. What is the nature of the process that converts brain events into behavioral percepts? An answer to this question is needed in order to understand how the brain controls behavior and how the brain is, in turn, shaped by environmental feedback that is experienced on the behavioral level. The nature of this connection also needs to be understood in order to develop neurally plausible connectionist models. Without it, a correct linking hypothesis cannot be developed between psychological data and the brain mechanisms from which they are generated.
The Hippocampus And Cerebellum In Adaptively Timed Learning, Recognition, And Movement
, 1995
Cited by 45 (26 self)
The concepts of declarative memory and procedural memory have been used to distinguish two basic types of learning. A neural network model suggests how such memory processes work together as recognition learning, reinforcement learning, and sensory-motor learning take place during adaptive behaviors. To coordinate these processes, the hippocampal formation and cerebellum each contain circuits that learn to adaptively time their outputs. Within the model, hippocampal timing helps to maintain attention on motivationally salient goal objects during variable task-related delays, and cerebellar timing controls the release of conditioned responses. This property is part of the model's description of how cognitive-emotional interactions focus attention on motivationally valued cues, and how this process breaks down due to hippocampal ablation. The model suggests that the hippocampal mechanisms that help to rapidly draw attention to salient cues could prematurely release motor commands were no...
A Taxonomy for Spatiotemporal Connectionist Networks Revisited: The Unsupervised Case
- Neural Computation
, 2003
Cited by 20 (1 self)
Spatiotemporal connectionist networks (STCN's) comprise an important class of neural models that can deal with patterns distributed both in time and space. In this paper, we widen the application domain of the taxonomy for supervised STCN's recently proposed by Kremer (2001) to the unsupervised case. This is possible through a reinterpretation of the state vector as a vector of latent (hidden) variables, as proposed by Meinicke (2000). The goal of this generalized taxonomy is then to provide a nonlinear generative framework for describing unsupervised spatiotemporal networks, making it easier to compare and contrast their representational and operational characteristics. Computational properties, representational issues and learning are also discussed and a number of references to the relevant source publications are provided. It is argued that the proposed approach is simple and more powerful than the previous attempts, from a descriptive and predictive viewpoint. We also discuss the relation of this taxonomy with automata theory and state space modeling, and suggest directions for further work.
Resonant Neural Dynamics Of Speech Perception
, 2003
Cited by 20 (4 self)
What is the neural representation of a speech code as it evolves in time? How do listeners integrate temporally distributed phonemic information across hundreds of milliseconds, even backwards in time, into coherent representations of syllables and words? What sorts of brain mechanisms encode the correct temporal order, despite such backwards effects, during speech perception? How does the brain extract rate-invariant properties of variable-rate speech? This article describes an emerging neural model that suggests answers to these questions, while quantitatively simulating challenging data about audition, speech and word recognition. This model includes bottom-up filtering, horizontal competitive, and top-down attentional interactions between a working memory for short-term storage of phonetic items and a list categorization network for grouping sequences of items. The conscious speech and word recognition code is suggested to be a resonant wave of activation across such a network, and a percept of silence is proposed to be a temporal discontinuity in the rate with which such a resonant wave evolves. Properties of these resonant waves can be traced to the brain mechanisms whereby auditory, speech, and language representations are learned in a stable way through time. Because resonances are proposed to control stable learning, the model is called an Adaptive Resonance Theory, or ART, model.
Neural Dynamics Of Perceptual Order And Context Effects For Variable-Rate Speech Syllables
, 1998
Cited by 12 (6 self)
How does the brain extract invariant properties of variable-rate speech? A neural model, called PHONET, is developed to explain aspects of this process and, along the way, data about perceptual context effects. For example, in consonant vowel (CV) syllables such as /ba/ and /wa/, an increase in the duration of the vowel can cause a switch in the percept of the preceding consonant from /w/ to /b/ (Miller and Liberman, 1979). The frequency extent of the initial formant transitions of fixed duration also influences the percept (Schwab, Sawusch, and Nusbaum, 1981). PHONET quantitatively simulates over 98% of the variance in these data using a single set of parameters. The model also qualitatively explains many data about other perceptual context effects. In the model, C and V inputs are filtered by parallel auditory streams that respond preferentially to transient and sustained properties of the acoustic signal before being stored in parallel working memories. A lateral inhibitory network ...
Integrating symbolic and neural processing in a self-organizing architecture for pattern recognition and prediction
- In: Artificial Intelligence and Neural Networks: Steps Toward Principled
, 1994
Cited by 10 (5 self)
Combining Multiple Views and Temporal Associations for 3-D Object Recognition
- In Proc. ECCV’98
, 1998
Cited by 4 (2 self)
This article describes an architecture for the recognition of three-dimensional objects on the basis of viewer centred representations and temporal associations. Considering evidence from psychophysics, neurophysiology, as well as computer science, we have decided to use a viewer centred approach for the representation of three-dimensional objects. Even though this concept quite naturally suggests utilizing the temporal order of the views for learning and recognition, this aspect is often neglected. Therefore we will pay special attention to the evaluation of the temporal information and embed it into the conceptual framework of biological findings and computational advantages. The proposed recognition system consists of four stages and includes different kinds of artificial neural networks: Preprocessing is done by a Gabor-based wavelet transform. A Dynamic Link Matching algorithm, extended by several modifications, forms the second stage. It implements recognition and learning of the v...
Unsupervised multimodal neural networks
, 2006
Cited by 3 (1 self)
We extend the in-situ Hebbian-linked SOMs network by Miikkulainen to come up with two unsupervised neural networks that learn the mapping between the individual modes of a multimodal dataset. The first network, the single-pass Hebbian linked SOMs network, extends the in-situ Hebbian-linked SOMs network by enabling the Hebbian link weights to be computed through one-shot learning. The second network, a modified counterpropagation network, extends the unsupervised learning of crossmodal mappings by making it possible for only one self-organising map to implement the crossmodal mapping. The two proposed networks each have a smaller computation time and achieve lower crossmodal mean squared errors than the in-situ Hebbian-linked SOMs network when assessed on two bimodal datasets, an audio-acoustic speech utterance dataset and a phonological-semantics child utterance dataset. Of the three network architectures, the modified counterpropagation network achieves the highest percentage of correct classifications comparable to that of the LVQ-2 algorithm by Kohonen and the neural network for category learning by de Sa and Ballard in classification tasks using the audio-acoustic speech utterance dataset.
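The idea of computing Hebbian link weights between two maps in a single pass can be sketched as follows. This is a minimal illustration under stated assumptions: both self-organising maps are taken to be already trained and reduced to one winning-unit index per sample, and the link weight is assumed to be a simple co-activation count. The function names and the counting rule are hypothetical and are not taken from the paper.

```python
def one_shot_hebbian_links(winners_a, winners_b, n_units_a, n_units_b):
    """Build link weights in one pass over paired samples.

    winners_a[k] and winners_b[k] are the winning unit indices of map A and
    map B for sample k (assumed to be precomputed by each map).
    """
    weights = [[0] * n_units_b for _ in range(n_units_a)]
    for unit_a, unit_b in zip(winners_a, winners_b):
        weights[unit_a][unit_b] += 1  # Hebbian-style co-activation count
    return weights


def crossmodal_map(weights, unit_a):
    """Map a unit of A to the unit of B with the strongest link."""
    row = weights[unit_a]
    return max(range(len(row)), key=row.__getitem__)


if __name__ == "__main__":
    links = one_shot_hebbian_links([0, 0, 1, 1, 1], [2, 2, 0, 0, 1], 2, 3)
    print(links, crossmodal_map(links, 0), crossmodal_map(links, 1))
```

Counting co-activations needs no iterative weight updates, which is one plausible reading of why a single-pass scheme would reduce computation time relative to in-situ training.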

