Results 1 - 10 of 30
Feature detection with automatic scale selection
- International Journal of Computer Vision
, 1998
Cited by 349 (25 self)
The fact that objects in the world appear in different ways depending on the scale of observation has important implications if one aims at describing them. It shows that the notion of scale is of utmost importance when processing unknown measurement data by automatic methods. In their seminal works, Witkin (1983) and Koenderink (1984) proposed to approach this problem by representing image structures at different scales in a so-called scale-space representation. Traditional scale-space theory building on this work, however, does not address the problem of how to select local appropriate scales for further analysis. This article proposes a systematic methodology for dealing with this problem. A framework is proposed for generating hypotheses about interesting scale levels in image data, based on a general principle stating that local extrema over scales of different combinations of γ-normalized derivatives are likely candidates to correspond to interesting structures. Specifically, it is shown how this idea can be used as a major mechanism in algorithms for automatic scale selection, which
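The scale-selection principle in this abstract can be illustrated with a small 1D sketch. It is not the article's implementation: the test signal (a Gaussian blob of variance `t0`), the closed-form smoothing, the scale grid, and all function names are assumptions chosen for illustration. The sketch evaluates a γ-normalized second derivative at the blob centre and picks the scale where its magnitude peaks.

```python
import math

def smoothed_blob(x, t, t0):
    """Gaussian blob of variance t0 after Gaussian smoothing with variance t.
    Assumption: closed form, since Gaussian * Gaussian adds variances."""
    s = t0 + t
    return math.sqrt(t0 / s) * math.exp(-x * x / (2.0 * s))

def normalized_second_derivative(t, t0, gamma=1.0, h=1e-2):
    """t**gamma * |L_xx| at the blob centre, using a central finite difference."""
    lxx = (smoothed_blob(h, t, t0)
           - 2.0 * smoothed_blob(0.0, t, t0)
           + smoothed_blob(-h, t, t0)) / (h * h)
    return (t ** gamma) * abs(lxx)

def select_scale(t0, gamma=1.0, scales=None):
    """Return the scale on a (hypothetical) grid where the normalized response peaks."""
    if scales is None:
        scales = [0.25 * k for k in range(1, 401)]  # t in (0, 100]
    return max(scales, key=lambda t: normalized_second_derivative(t, t0, gamma))

if __name__ == "__main__":
    for t0 in (1.0, 4.0, 9.0):
        print(t0, select_scale(t0))
```

For this particular 1D signal with γ = 1, the response t (t0 + t)^(-3/2) has its maximum at t = 2 t0, so the selected scale grows in proportion to the size of the structure. That proportionality is the property a scale-selection mechanism relies on.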
Preattentive texture discrimination with early vision mechanisms
- Journal of the Optical Society of America A
, 1990
"... mechanisms ..."
An Active Vision Architecture based on Iconic Representations
- Artificial Intelligence
, 1995
Cited by 116 (12 self)
Active vision systems have the capability of continuously interacting with the environment. The rapidly changing environment of such systems means that it is attractive to replace static representations with visual routines that compute information on demand. Such routines place a premium on image data structures that are easily computed and used. The purpose of this paper is to propose a general active vision architecture based on efficiently computable iconic representations. This architecture employs two primary visual routines, one for identifying the visual image near the fovea (object identification), and another for locating a stored prototype on the retina (object location). This design allows complex visual behaviors to be obtained by composing these two routines with different parameters. The iconic representations are comprised of high-dimensional feature vectors obtained from the responses of an ensemble of Gaussian derivative spatial filters at a number of orientations and...
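A feature vector built from oriented Gaussian derivative responses might be sketched as follows. This is a simplified illustration, not the paper's architecture: it uses only first-order derivatives, obtains each orientation by steering the x and y responses, and the kernel radius, scales, and function names are all assumptions.

```python
import math

def derivative_kernels(sigma, radius):
    """Sampled first-order Gaussian derivative kernels (d/dx and d/dy), unnormalized."""
    gx, gy = [], []
    for ky in range(-radius, radius + 1):
        row_x, row_y = [], []
        for kx in range(-radius, radius + 1):
            g = math.exp(-(kx * kx + ky * ky) / (2.0 * sigma * sigma))
            row_x.append(-kx / (sigma * sigma) * g)
            row_y.append(-ky / (sigma * sigma) * g)
        gx.append(row_x)
        gy.append(row_y)
    return gx, gy

def convolve_at(image, kernel, px, py):
    """Convolution response at one pixel; the caller keeps (px, py) away from borders."""
    radius = len(kernel) // 2
    total = 0.0
    for ky in range(-radius, radius + 1):
        for kx in range(-radius, radius + 1):
            total += image[py - ky][px - kx] * kernel[ky + radius][kx + radius]
    return total

def iconic_vector(image, px, py, sigmas=(1.0, 2.0), n_orientations=4):
    """Responses at several scales and orientations, concatenated into one vector."""
    features = []
    for sigma in sigmas:
        radius = int(math.ceil(3 * sigma))
        gx, gy = derivative_kernels(sigma, radius)
        rx = convolve_at(image, gx, px, py)
        ry = convolve_at(image, gy, px, py)
        for k in range(n_orientations):
            theta = math.pi * k / n_orientations
            # A first-order derivative is steerable: cos(theta)*d/dx + sin(theta)*d/dy.
            features.append(math.cos(theta) * rx + math.sin(theta) * ry)
    return features

if __name__ == "__main__":
    ramp = [[float(x) for x in range(21)] for _ in range(21)]  # intensity rises with x
    print(iconic_vector(ramp, 10, 10))
```

On a horizontal intensity ramp the response is largest for the orientation aligned with the gradient and vanishes for the perpendicular one, which is what makes such vectors descriptive of local image structure.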
Robust computation of optic flow in a multiscale differential framework
- International Journal of Computer Vision
, 1995
Cited by 83 (2 self)
We have developed a new algorithm for computing optical flow in a differential framework. The image sequence is first convolved with a set of linear, separable spatiotemporal filter kernels similar to those that have been used in other early vision problems such as texture and stereopsis. The brightness constancy constraint can then be applied to each of the resulting images, giving us, in general, an overdetermined system of equations for the optical flow at each pixel. There are three principal sources of error: (a) stochastic error due to sensor noise, (b) systematic errors in the presence of large displacements, and (c) errors due to failure of the brightness constancy model. Our analysis of these errors leads us to develop an algorithm based on a robust version of total least squares. Each optical flow vector computed has an associated reliability measure which can be used in subsequent processing. The performance of the algorithm on the data set used by Barron et al. (IJCV 1994) compares favorably with other techniques. In addition to being separable, the filters used are also causal, incorporating only past time frames. The algorithm is fully parallel and has been implemented on a multiple processor machine.
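The overdetermined system mentioned in this abstract can be illustrated with a plain total least squares solve. This is a sketch only: it omits the robust weighting and the reliability measure the abstract describes, and the function name, the power-iteration solver, and the sample data are assumptions. Each constraint is a triple `(gx, gy, gt)` with `gx*u + gy*v + gt ≈ 0`.

```python
import math

def tls_flow(constraints, iterations=500):
    """Plain (non-robust) total least squares estimate of the flow (u, v).

    Finds the unit vector n minimizing sum((g . n)**2); the flow is (n0/n2, n1/n2).
    """
    # Scatter matrix M = sum of g g^T over all constraints.
    m = [[0.0] * 3 for _ in range(3)]
    for g in constraints:
        for i in range(3):
            for j in range(3):
                m[i][j] += g[i] * g[j]
    trace = m[0][0] + m[1][1] + m[2][2]
    if trace == 0.0:
        raise ValueError("no gradient information")
    # Power iteration on trace*I - M converges to the eigenvector of M with the
    # smallest eigenvalue. Convergence is slow when M is ill-conditioned, and the
    # fixed start vector must not be orthogonal to the solution.
    b = [[(trace if i == j else 0.0) - m[i][j] for j in range(3)] for i in range(3)]
    n = [1.0, 1.0, 1.0]
    for _ in range(iterations):
        n = [sum(b[i][j] * n[j] for j in range(3)) for i in range(3)]
        norm = math.sqrt(sum(c * c for c in n))
        n = [c / norm for c in n]
    if abs(n[2]) < 1e-12:
        raise ValueError("flow is not determined by these constraints")
    return n[0] / n[2], n[1] / n[2]

if __name__ == "__main__":
    true_u, true_v = 0.5, -0.25
    gradients = [(1.0, 0.0), (0.0, 1.0), (1.0, 1.0), (1.0, -1.0), (2.0, 1.0)]
    data = [(gx, gy, -(gx * true_u + gy * true_v)) for gx, gy in gradients]
    print(tls_flow(data))
```

Unlike ordinary least squares, which attributes all error to the temporal derivative, total least squares allows for error in the spatial derivatives as well, which is one reason to prefer it when every filter response is noisy.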
Dynamic Model of Visual Recognition Predicts Neural Response Properties in the Visual Cortex
- Neural Computation
, 1995
Cited by 77 (20 self)
In this paper, we describe a hierarchical network model of visual recognition that explains these experimental observations by using a form of the extended Kalman filter as given by the Minimum Description Length (MDL) principle. The model dynamically combines input-driven bottom-up signals with expectation-driven top-down signals to predict current recognition state. Synaptic weights in the model are adapted in a Hebbian manner according to a learning rule also derived from the MDL principle. The resulting prediction/learning scheme can be viewed as implementing a form of the Expectation-Maximization (EM) algorithm. The architecture of the model posits an active computational role for the reciprocal connections between adjoining visual cortical areas in determining neural response properties. In particular, the model demonstrates the possible role of feedback from higher cortical areas in mediating neurophysiological effects due to stimuli from beyond the classical receptive field.
Object indexing using an iconic sparse distributed memory
, 1995
Cited by 57 (8 self)
A general-purpose object indexing technique is described that combines the virtues of principal component analysis with the favorable matching properties of high-dimensional spaces to achieve high precision recognition. An object is represented by a set of high-dimensional iconic feature vectors comprised of the responses of derivative of Gaussian filters at a range of orientations and scales. Since these filters can be shown to form the eigenvectors of arbitrary images containing both natural and man-made structures, they are well-suited for indexing in disparate domains. The indexing algorithm uses an active vision system in conjunction with a modified form of Kanerva’s sparse distributed memory which facilitates interpolation between views and provides a convenient platform for learning the association between an object’s appearance and its identity. The robustness of the indexing method was experimentally confirmed by subjecting the method to a range of viewing conditions and the accuracy was verified using a well-known model database containing a number of complex 3D objects under varying pose.
Logical/Linear Operators for Image Curves
- IEEE Transactions on Pattern Analysis and Machine Intelligence
, 1995
Cited by 37 (7 self)
We propose a language for designing image measurement operators suitable for early vision. We refer to them as logical/linear (L/L) operators, since they unify aspects of linear operator theory and boolean logic. A family of these operators appropriate for measuring the low-order differential structure of image curves is developed. These L/L operators are derived by decomposing a linear model into logical components to ensure that certain structural preconditions for the existence of an image curve are upheld. Tangential conditions guarantee continuity, while normal conditions select and categorize contrast profiles. The resulting operators allow for coarse measurement of curvilinear differential structure (orientation and curvature) while successfully segregating edge- and line-like features. By thus reducing the incidence of false-positive responses, these operators are a substantial improvement over (thresholded) linear operators which attempt to resolve the same class of features. ...
Natural basis functions and topographic memory for face recognition
- In Proc. of IJCAI
, 1995
Cited by 36 (4 self)
Recent work regarding the statistics of natural images has revealed that the dominant eigenvectors of arbitrary natural images closely approximate various oriented derivative-of-Gaussian functions; these functions have also been shown to provide the best fit to the receptive field profiles of cells in the primate striate cortex. We propose a scheme for expression-invariant face recognition that employs a fixed set of these "natural" basis functions to generate multiscale iconic representations of human faces. Using a fixed set of basis functions obviates the need for recomputing eigenvectors (a step that was necessary in some previous approaches employing principal component analysis (PCA) for recognition) while at the same time retaining the redundancy-reducing properties of PCA. A face is represented by a set of iconic representations automatically extracted from an input image. The description thus obtained is stored in a topographically-organized sparse distributed memory that is based on a model of human long-term memory first proposed by Kanerva. We describe experimental results for an implementation of the method on a pipeline image processor that is capable of achieving near real-time recognition by exploiting the processor's frame-rate convolution capability for indexing purposes.
Color and scale: The spatial structure of color images
- Sixth European Conference on Computer Vision (ECCV)
, 2000
Cited by 23 (11 self)
For grey-value images, it is well accepted that the neighborhood rather than the pixel carries the geometrical interpretation. Interestingly, the spatial configuration of the neighborhood is the basis for the perception of humans. Common practice in color image processing is to use the color information without considering the spatial structure. We aim at a physical basis for the local interpretation of color images. We propose a framework for spatial color measurement, based on the Gaussian scale-space theory. We consider a Gaussian color model, which inherently uses the spatial and color information in an integrated model. The framework is well-founded in physics as well as in measurement science. The framework delivers sound and robust spatial color invariant features. The usefulness of the proposed measurement framework is illustrated by edge detection, where edges are discriminated as shadow, highlight, or object boundary. Other applications of the framework include color invariant image retrieval and color constant edge detection.
Multiresolution Histograms and their Use for Recognition
- IEEE Transactions on Pattern Analysis and Machine Intelligence
, 2004
Cited by 21 (0 self)
The histogram of image intensities is used extensively for recognition and for retrieval of images and video from visual databases. A single image histogram, however, suffers from the inability to encode spatial image variation. An obvious way to extend this feature is to compute the histograms of multiple resolutions of an image to form a multiresolution histogram. The multiresolution histogram shares many desirable properties with the plain histogram including that they are both fast to compute, space efficient, invariant to rigid motions, and robust to noise. In addition, the multiresolution histogram directly encodes spatial information. We describe a simple yet novel matching algorithm based on the multiresolution histogram that uses the differences between histograms of consecutive image resolutions. We evaluate it against five widely used image features. We show that with our simple feature we achieve or exceed the performance obtained with more complicated features. Further, we show our algorithm to be the most efficient and robust.
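The idea of differencing histograms across resolutions can be sketched in a few lines. This is an illustration under stated assumptions, not the paper's algorithm: resolution is reduced here by 2x2 block averaging, and the bin count, number of levels, and function names are hypothetical.

```python
def histogram(image, bins=4, max_value=256):
    """Normalized intensity histogram of a 2D list of values in [0, max_value)."""
    counts = [0] * bins
    total = 0
    for row in image:
        for v in row:
            counts[min(int(v * bins / max_value), bins - 1)] += 1
            total += 1
    return [c / total for c in counts]

def halve(image):
    """Assumption: lower the resolution by averaging each 2x2 block."""
    h, w = len(image) // 2, len(image[0]) // 2
    return [[(image[2 * r][2 * c] + image[2 * r][2 * c + 1]
              + image[2 * r + 1][2 * c] + image[2 * r + 1][2 * c + 1]) / 4.0
             for c in range(w)] for r in range(h)]

def multiresolution_feature(image, levels=2, bins=4):
    """Concatenated differences between histograms of consecutive resolutions."""
    hists = [histogram(image, bins)]
    current = image
    for _ in range(levels - 1):
        if len(current) < 2 or len(current[0]) < 2:
            break  # image too small to reduce further
        current = halve(current)
        hists.append(histogram(current, bins))
    return [b - a for h0, h1 in zip(hists, hists[1:]) for a, b in zip(h0, h1)]

if __name__ == "__main__":
    checker = [[255 if (r + c) % 2 else 0 for c in range(4)] for r in range(4)]
    split = [[0, 0, 255, 255] for _ in range(4)]
    print(histogram(checker), histogram(split))
    print(multiresolution_feature(checker), multiresolution_feature(split))
```

The checkerboard and the half-dark, half-bright image have identical plain histograms, yet their multiresolution features differ because averaging turns the checkerboard into mid-grey while leaving the split image unchanged. This is the spatial information that a single histogram cannot capture.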

