Results 1 - 10 of 13
An Active Vision Architecture based on Iconic Representations
Artificial Intelligence, 1995
Cited by 116 (12 self)
Active vision systems have the capability of continuously interacting with the environment. The rapidly changing environment of such systems means that it is attractive to replace static representations with visual routines that compute information on demand. Such routines place a premium on image data structures that are easily computed and used. The purpose of this paper is to propose a general active vision architecture based on efficiently computable iconic representations. This architecture employs two primary visual routines, one for identifying the visual image near the fovea (object identification), and another for locating a stored prototype on the retina (object location). This design allows complex visual behaviors to be obtained by composing these two routines with different parameters. The iconic representations are comprised of high-dimensional feature vectors obtained from the responses of an ensemble of Gaussian derivative spatial filters at a number of orientations and...
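As a rough illustration of the kind of iconic feature vector the abstract describes, the following minimal sketch collects first-order Gaussian derivative responses at one image location. It assumes only first-order derivatives, four orientations, and two scales (the abstract is truncated, so the use of scales here is an assumption), and the function names are hypothetical. Oriented responses are obtained by steering the x and y derivatives, a standard property of first-order Gaussian derivatives, not necessarily the paper's filter bank.

```python
import math

def gaussian_derivative_kernels(sigma):
    """Return (kx, ky) correlation kernels as {(dx, dy): weight}.

    The weights are the mirrored first derivative of an unnormalised
    Gaussian, so correlating with kx is positive where intensity
    increases along +x (and likewise for ky along +y).
    """
    radius = int(math.ceil(3 * sigma))
    kx, ky = {}, {}
    for dy in range(-radius, radius + 1):
        for dx in range(-radius, radius + 1):
            g = math.exp(-(dx * dx + dy * dy) / (2.0 * sigma * sigma))
            kx[(dx, dy)] = dx / (sigma * sigma) * g
            ky[(dx, dy)] = dy / (sigma * sigma) * g
    return kx, ky

def response(image, kernel, x, y):
    """Correlate the kernel with the image centred on (x, y); borders are clamped."""
    h, w = len(image), len(image[0])
    total = 0.0
    for (dx, dy), weight in kernel.items():
        px = min(max(x + dx, 0), w - 1)
        py = min(max(y + dy, 0), h - 1)
        total += weight * image[py][px]
    return total

def iconic_vector(image, x, y, sigmas=(1.0, 2.0), n_orient=4):
    """Stack oriented derivative responses over all scales into one vector."""
    vec = []
    for sigma in sigmas:
        kx, ky = gaussian_derivative_kernels(sigma)
        rx = response(image, kx, x, y)
        ry = response(image, ky, x, y)
        for k in range(n_orient):
            theta = math.pi * k / n_orient
            # Steering: the derivative along theta is a blend of the x and y derivatives.
            vec.append(math.cos(theta) * rx + math.sin(theta) * ry)
    return vec
```

On a vertical step edge, the component for orientation 0 responds strongly while the component for 90 degrees is close to zero. In a setting like the one described, two such vectors could then be compared with a distance measure to identify or locate an object.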
Object indexing using an iconic sparse distributed memory
1995
Cited by 57 (8 self)
A general-purpose object indexing technique is described that combines the virtues of principal component analysis with the favorable matching properties of high-dimensional spaces to achieve high precision recognition. An object is represented by a set of high-dimensional iconic feature vectors comprised of the responses of derivative of Gaussian filters at a range of orientations and scales. Since these filters can be shown to form the eigenvectors of arbitrary images containing both natural and man-made structures, they are well-suited for indexing in disparate domains. The indexing algorithm uses an active vision system in conjunction with a modified form of Kanerva’s sparse distributed memory which facilitates interpolation between views and provides a convenient platform for learning the association between an object’s appearance and its identity. The robustness of the indexing method was experimentally confirmed by subjecting the method to a range of viewing conditions and the accuracy was verified using a well-known model database containing a number of complex 3D objects under varying pose.
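For readers unfamiliar with sparse distributed memory, here is a minimal sketch of the basic textbook form of Kanerva's memory with binary addresses and counters. It is not the modified form used in the paper, whose details the abstract does not give. The class name, sizes, and activation radius are illustrative assumptions.

```python
import random

class SparseDistributedMemory:
    """Basic Kanerva-style memory: random hard locations with bit counters."""

    def __init__(self, n_locations=500, address_bits=64, data_bits=16,
                 radius=30, seed=0):
        rng = random.Random(seed)
        self.radius = radius
        # Fixed random addresses of the hard locations.
        self.addresses = [[rng.randint(0, 1) for _ in range(address_bits)]
                          for _ in range(n_locations)]
        # One signed counter per data bit at every hard location.
        self.counters = [[0] * data_bits for _ in range(n_locations)]

    def _active(self, address):
        """Yield indices of hard locations within Hamming radius of address."""
        for i, hard in enumerate(self.addresses):
            if sum(a != b for a, b in zip(address, hard)) <= self.radius:
                yield i

    def write(self, address, data):
        """Add the data pattern to the counters of every active location."""
        for i in self._active(address):
            row = self.counters[i]
            for j, bit in enumerate(data):
                row[j] += 1 if bit else -1

    def read(self, address):
        """Sum counters over active locations and threshold at zero."""
        sums = [0] * len(self.counters[0])
        for i in self._active(address):
            for j, count in enumerate(self.counters[i]):
                sums[j] += count
        return [1 if s > 0 else 0 for s in sums]
```

Because a write spreads the pattern over every hard location near the address, a read from a nearby address still activates many of the same locations. This is the property that makes such a memory tolerant of small changes in the cue, which in the paper's setting would correspond to changes in an object's appearance.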
Issues in Vision Modeling for Perceptual Video Quality Assessment
1999
Cited by 47 (10 self)
Lossy compression algorithms used in digital video systems produce artifacts whose visibility strongly depends on the actual image content. Simple error measures such as RMSE or PSNR, albeit popular, ignore this important fact and are only a mediocre predictor of perceived quality. Many applications require more reliable assessment methods. This paper discusses issues in vision modeling for perceptual video quality assessment (PVQA). Its purpose is not to describe a particular model or system, but rather to summarize and to provide pointers to up-to-date knowledge of important characteristics of the human visual system, to explain how these characteristics may be incorporated in vision models for PVQA, to give a brief overview of the state-of-the-art and current efforts in this field, and to outline directions for future research.
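To make the point about simple error measures concrete, the sketch below computes RMSE and PSNR for flat lists of 8-bit pixel values. The function names and the 255 peak value are assumptions for illustration. Two quite different distortions can yield the same score, which is why such measures cannot account for where in the image the errors fall.

```python
import math

def rmse(reference, distorted):
    """Root-mean-square error between two equally sized pixel lists."""
    n = len(reference)
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(reference, distorted)) / n)

def psnr(reference, distorted, peak=255.0):
    """Peak signal-to-noise ratio in dB; infinite for identical inputs."""
    error = rmse(reference, distorted)
    return math.inf if error == 0 else 20.0 * math.log10(peak / error)

# Two distortions of a flat grey patch with identical RMSE and PSNR:
reference = [100.0] * 64
uniform_shift = [104.0] * 64                                   # every pixel off by 4
sparse_errors = [108.0 if i % 4 == 0 else 100.0 for i in range(64)]  # a quarter off by 8
```

Both `uniform_shift` and `sparse_errors` have an RMSE of 4 against `reference` and therefore the same PSNR, although a uniform brightness shift and scattered local errors would generally not look alike to a viewer.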
Physiological Computation of Binocular Disparity
1997
Cited by 33 (10 self)
We previously proposed a physiologically realistic model for stereo vision based on the quantitative binocular receptive field profiles mapped by Freeman and coworkers. Here we present several new results about the model that shed light on the physiological processes involved in disparity computation. First, we show that our model can be extended to a much more general class of receptive field profiles than the commonly used Gabor functions. Second, we demonstrate that there is, however, an advantage of using the Gabor filters: Similar to our perception, the stereo algorithm with the Gabor filters has a small bias towards zero disparity. Third, we prove that the complex cells as described by Freeman et al. compute disparity by effectively summing up two related cross products between the band-pass filtered left and right retinal image patches. This operation is related to cross-correlation but it overcomes some major problems with the standard correlator. Fourth, we demonstrate that as...
Perceptually Modulated Level of Detail for Virtual Environments
1997
Cited by 31 (2 self)
This thesis presents a generic and principled solution for optimising the visual complexity of any arbitrary computer-generated virtual environment (VE). This is performed with the ultimate goal of reducing the inherent latencies of current virtual reality (VR) technology. Effectively, we wish to remove extraneous detail from an environment which the user cannot perceive, and thus modulate the graphical complexity of a VE with little or no perceptual artifacts. The work proceeds by investigating contemporary models and theories of visual perception and then applying these to the field of real-time computer graphics. Subsequently, a technique is devised to assess the perceptual content of a computer-generated image in terms of spatial frequency (c/deg), and a model of contrast sensitivity is formulated to describe a user's ability to perceive detail under various conditions in terms of this metric. This allows us to base the level of detail (LOD) of each object in a VE on a measure of ...
Wavelets, vision and the statistics of natural scenes
71 Academy of Science, Engineering and Technology 26 2007, 1999
Cited by 24 (0 self)
The processing of spatial information by the visual system shows a number of similarities to the wavelet transforms that have become popular in applied mathematics. Over the last decade, a range of studies has focused on the question of ‘why’ the visual system would evolve this strategy of coding spatial information. One such approach has focused on the relationship between the visual code and the statistics of natural scenes under the assumption that the visual system has evolved this strategy as a means of optimizing the representation of its visual environment. This paper reviews some of this literature and looks at some of the statistical properties of natural scenes that allow this code to be efficient. It is argued that such wavelet codes are efficient because they increase the independence of the vectors’ outputs (i.e. they increase the independence of the responses of the visual neurons) by finding the sparse structure available in the input. Studies with neural networks that attempt to maximize the ‘sparsity’ of the representation have been shown to produce vectors (neural receptive fields) that have many of the properties of a wavelet representation. It is argued that the visual environment has the appropriate sparse structure to make this sparse output possible. It is argued that these sparse/independent representations make it computationally easier to detect and represent the higher-order structure present in complex environmental data.
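One common way to put a number on the sparsity of a set of responses is the kurtosis of their distribution. The sketch below assumes that measure; the abstract does not say which measure the reviewed studies used, and the function name is hypothetical.

```python
def excess_kurtosis(values):
    """Fourth standardised moment minus 3 (zero for a Gaussian distribution)."""
    n = len(values)
    mean = sum(values) / n
    m2 = sum((v - mean) ** 2 for v in values) / n
    m4 = sum((v - mean) ** 4 for v in values) / n
    return m4 / (m2 * m2) - 3.0

# A sparse code: almost all responses are zero, a few are large.
sparse_responses = [0.0] * 98 + [5.0, -5.0]
# A dense code: every response is equally active.
dense_responses = [1.0, -1.0] * 50
```

With these toy values the sparse responses have an excess kurtosis of 47 and the dense ones of -2, so a higher value indicates that activity is concentrated in fewer responses.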
Lateral Inhibition in Cortical Filters
Proc. of Int. Conf. on Digital Signal Processing and Int. Conf. on Computer Applications to Engineering Systems, 1993
Cited by 13 (9 self)
This work presents explorations in the microstructure of natural vision systems based on large scale computer simulations. Similarly to previous work in this area, we compute the functional inner products of a two-dimensional input signal (image) with a set of two-dimensional Gabor functions which have been shown to fit the receptive fields of simple cells in the primary visual cortex of mammals. These inner products are then considered as net inputs to the cortical cells and used to compute the cell activations as non-linear functions. A previously used model is extended with a pixel-wise winner-takes-all competition between different Gabor filters which is introduced in order to model lateral inhibition between cortical cells. The effect of lateral inhibition is qualitatively estimated by visualization of computed cortical images and quantitatively evaluated by applying the model to a face recognition problem. A recognition rate of 97% was achieved on a database of 205 face i...
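The pipeline in this abstract (Gabor inner products, a non-linear activation, then a pixel-wise winner-takes-all across filters) can be sketched as follows. The sketch assumes cosine-phase Gabor kernels, half-wave rectification as the non-linearity, and arbitrary kernel parameters. None of these choices are stated in the abstract, and all function names are hypothetical.

```python
import math

def gabor_kernel(theta, wavelength=4.0, sigma=2.0, radius=4):
    """Cosine-phase Gabor kernel as {(dx, dy): weight}, with the mean removed."""
    kernel = {}
    for dy in range(-radius, radius + 1):
        for dx in range(-radius, radius + 1):
            # Coordinate along the direction in which the carrier oscillates.
            u = dx * math.cos(theta) + dy * math.sin(theta)
            envelope = math.exp(-(dx * dx + dy * dy) / (2.0 * sigma * sigma))
            kernel[(dx, dy)] = envelope * math.cos(2.0 * math.pi * u / wavelength)
    mean = sum(kernel.values()) / len(kernel)
    return {offset: w - mean for offset, w in kernel.items()}

def inner_product(image, kernel, x, y):
    """Inner product of the kernel with the image patch centred on (x, y)."""
    h, w = len(image), len(image[0])
    total = 0.0
    for (dx, dy), weight in kernel.items():
        px = min(max(x + dx, 0), w - 1)   # clamp at the image border
        py = min(max(y + dy, 0), h - 1)
        total += weight * image[py][px]
    return total

def winner_takes_all(activations):
    """Keep only the largest activation at a pixel; zero all the others."""
    winner = max(range(len(activations)), key=lambda i: activations[i])
    return [a if i == winner else 0.0 for i, a in enumerate(activations)]

def cortical_images(image, thetas):
    """One output map per orientation, after rectification and competition."""
    kernels = [gabor_kernel(t) for t in thetas]
    h, w = len(image), len(image[0])
    maps = [[[0.0] * w for _ in range(h)] for _ in kernels]
    for y in range(h):
        for x in range(w):
            # Net input -> activation via half-wave rectification (an assumption).
            acts = [max(0.0, inner_product(image, k, x, y)) for k in kernels]
            for i, a in enumerate(winner_takes_all(acts)):
                maps[i][y][x] = a
    return maps
```

For an image of vertical stripes whose period matches the kernel wavelength, the filter with `theta = 0` wins at the stripe peaks and the other orientation maps are zero there, which is the intended effect of the competition.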
The Gaussian Derivative model for spatial-temporal vision: I. Cortical Model
Spatial Vision, 2001
Cited by 12 (0 self)
Receptive fields of simple cells in the primate visual cortex were well fit in the space and time domains by the Gaussian Derivative (GD) model for spatio-temporal vision. All 23 fields in the data sample could be fit by one equation, varying only a single shape number and nine geometric transformation parameters. A difference-of-offset-Gaussians (DOOG) mechanism for the GD model also fit the data well. Other models tested did not fit the data as well as or as succinctly, or failed to converge on a unique solution, indicating over-parameterization. An efficient computational algorithm was found for the GD model which produced robust estimates of the direction and speed of moving objects in real scenes.
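One way to read the link between the DOOG mechanism and the GD model is that the difference of two slightly offset Gaussians approximates a Gaussian derivative. The sketch below checks this numerically for the first derivative in one spatial dimension only. The function names, the offsets, and the restriction to first order are assumptions for illustration and do not reproduce the paper's fitting procedure.

```python
import math

def gaussian(x, sigma=1.0):
    """Normalised 1D Gaussian."""
    return math.exp(-x * x / (2.0 * sigma * sigma)) / (sigma * math.sqrt(2.0 * math.pi))

def gaussian_derivative(x, sigma=1.0):
    """Analytic first derivative of the Gaussian."""
    return -x / (sigma * sigma) * gaussian(x, sigma)

def doog(x, offset, sigma=1.0):
    """Difference of two Gaussians offset by +/- offset, scaled to match the derivative."""
    return (gaussian(x + offset, sigma) - gaussian(x - offset, sigma)) / (2.0 * offset)

def max_error(offset, sigma=1.0):
    """Largest gap between DOOG and the analytic derivative on [-4, 4]."""
    xs = [i / 10.0 for i in range(-40, 41)]
    return max(abs(doog(x, offset, sigma) - gaussian_derivative(x, sigma)) for x in xs)
```

The approximation tightens as the offset shrinks: `max_error(0.1)` is smaller than `max_error(0.5)`.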
Interacting Cortical Filters for Object Recognition
Proceedings of Asian Conference on Computer Vision, ACCV '93, 1993
Cited by 6 (5 self)
It is shown how cortical filters can be used for image analysis and object recognition. Similarly to previous work in this area, we compute functional inner products of a two-dimensional input signal (image) with a set of two-dimensional Gabor functions which fit the receptive fields of simple cells in the primary visual cortex of mammals. We propose a method in which these inner products become the subject of thresholding, orientation competition and lateral inhibition. Each of the resulting cortical images contains only edge lines of a particular orientation and a particular light-to-dark transition direction. In this way, the information which is present in the original image is split in different channels and we show how this splitting can be used for object recognition. The method discriminates between simple geometrical figures, e.g. polygons with different numbers of edges, with reliability of 100% and a recognition rate of 99% has been achieved when the method was applied to a l...
Probabilistic Models of Early Vision
2002
Cited by 2 (0 self)
How do our brains transform patterns of light striking the retina into useful knowledge about objects and events of the external world? Thanks to intense research into the mechanisms of vision, much is now known about this process. However, we do not yet have anything close to a complete picture, and many questions remain unanswered. In addition to its clinical relevance and purely academic significance, research on vision is important because a thorough understanding of biological vision would probably help solve many major problems in computer vision.

