Recognition without Correspondence using Multidimensional Receptive Field Histograms
International Journal of Computer Vision, 2000
Cited by 209 (19 self)
The appearance of an object is composed of local structure. This local structure can be described and characterized by a vector of local features measured by local operators such as Gaussian derivatives or Gabor filters. This article presents a technique where appearances of objects are represented by the joint statistics of such local neighborhood operators. As such, this represents a new class of appearance-based techniques for computer vision. Based on joint statistics, the paper develops techniques for the identification of multiple objects at arbitrary positions and orientations in a cluttered scene. Experiments show that these techniques can identify over 100 objects in the presence of major occlusions. Most remarkably, the techniques have low complexity and therefore run in real time.
1. Introduction
The paper proposes a framework for the statistical representation of the appearance of arbitrary 3D objects. This representation consists of a probability density function or jo...
A Tensor Framework for Multidimensional Signal Processing
Linköping University, Sweden, 1994
Cited by 53 (8 self)
About the cover: The figure on the cover shows a visualization of a symmetric tensor in three dimensions, $G = \lambda_1 \hat{e}_1 \hat{e}_1^T + \lambda_2 \hat{e}_2 \hat{e}_2^T + \lambda_3 \hat{e}_3 \hat{e}_3^T$. The object in the figure is the sum of a spear, a plate and a sphere. The spear describes the principal direction of the tensor, $\lambda_1 \hat{e}_1 \hat{e}_1^T$, where the length is proportional to the largest eigenvalue, $\lambda_1$. The plate describes the plane spanned by the eigenvectors corresponding to the two largest eigenvalues, $\lambda_2 (\hat{e}_1 \hat{e}_1^T + \hat{e}_2 \hat{e}_2^T)$. The sphere, with a radius proportional to the smallest eigenvalue, shows how isotropic the tensor is, $\lambda_3 (\hat{e}_1 \hat{e}_1^T + \hat{e}_2 \hat{e}_2^T + \hat{e}_3 \hat{e}_3^T)$. The visualization is done using AVS [WWW94]. I am very grateful to Johan Wiklund for implementing the tensor viewer module used.
This thesis deals with filtering of multidimensional signals. A large part of the thesis is devoted to a novel filtering method termed “Normalized convolution”. The method performs local expansion of a signal in a chosen filter basis which
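The excerpt breaks off before describing normalized convolution in detail. A minimal sketch of the idea, assuming the simplest case in which the filter basis is a single constant function (often called normalized averaging), is given below in Python. The function name, the 1-D setting and the certainty weights are illustrative assumptions, not taken from the thesis.

```python
def normalized_average(signal, certainty, kernel):
    """Zeroth-order normalized convolution of a 1-D signal.

    Each output sample is a kernel-weighted mean of its neighbours,
    where every neighbour is additionally weighted by its certainty
    (0 = missing sample, 1 = fully trusted sample).
    """
    half = len(kernel) // 2
    out = []
    for i in range(len(signal)):
        num = 0.0  # convolution of (certainty * signal) with the kernel
        den = 0.0  # convolution of the certainty alone with the kernel
        for k, a in enumerate(kernel):
            j = i + k - half
            if 0 <= j < len(signal):
                num += a * certainty[j] * signal[j]
                den += a * certainty[j]
        # With no certain sample in reach the output is undefined;
        # this sketch returns 0.0 in that case.
        out.append(num / den if den > 0 else 0.0)
    return out


# The sample at index 2 is missing (certainty 0); its stored value is ignored.
filled = normalized_average([1, 1, 0, 1, 1], [1, 1, 0, 1, 1], [1, 2, 1])
```

Dividing by the filtered certainty is what distinguishes this from ordinary smoothing: a missing sample neither drags the local average down nor has to be interpolated beforehand.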
Adaptive Multidimensional Filtering
Linköping University, Sweden, 1992
Cited by 30 (0 self)
This thesis contains a presentation and an analysis of adaptive filtering strategies for multidimensional data. The size, shape and orientation of the filter are signal controlled and thus adapted locally to each neighbourhood according to a predefined model. The filter is constructed as a linear weighting of fixed oriented bandpass filters having the same shape but different orientations. The adaptive filtering methods have been tested with good results on both real data and synthesized test data in 2D (e.g. still images) and 3D (e.g. image sequences or volumes). In 4D (e.g. volume sequences), the algorithm is given in its mathematical form. The weighting coefficients are given by the inner products of a tensor representing the local structure of the data and the tensors representing the orientation of the filters. The procedure and filter design in estimating the representation tensor are described. In 2D, the tensor contains information about the local energy, the optimal orientation and a certainty of the orientation. In 3D, the information in the tensor is the energy, the normal to the best fitting local plane and the tangent to the best fitting line, and certainties of these orientations. In the case of time sequences, a quantitative comparison of the proposed method and other (optical flow) algorithms is presented. The estimation of control information is made at different scales. There are two main reasons for this. First, a single filter has a limited pass band, which may or may not be tuned to the differently sized objects to be described. Second, size or scale is a descriptive feature in its own right. All of this requires the integration of measurements from different scales. The increasing interest in wavelet theory supports the idea that a multiresolution approach is necessary. Hence the resulting adaptive filter will adapt also in size and to different orientations at different scales.
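The weighting step described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: it works in 2D, represents each fixed filter's orientation by the outer product of its unit direction vector, uses three filters at 0, 60 and 120 degrees, and takes the filter responses as given numbers. The thesis may use differently constructed orientation tensors, and all names here are hypothetical.

```python
import math


def orientation_tensor(angle):
    """Outer product n n^T of the unit vector at `angle` (an assumed form)."""
    n = (math.cos(angle), math.sin(angle))
    return [[n[0] * n[0], n[0] * n[1]],
            [n[1] * n[0], n[1] * n[1]]]


def inner(a, b):
    """Frobenius inner product of two 2x2 matrices."""
    return sum(a[i][j] * b[i][j] for i in range(2) for j in range(2))


def adaptive_output(structure, responses, angles):
    """Weight fixed oriented filter responses by the local structure tensor."""
    weights = [inner(structure, orientation_tensor(t)) for t in angles]
    output = sum(w * r for w, r in zip(weights, responses))
    return output, weights


# Local structure oriented purely along the x axis.
structure = [[1.0, 0.0], [0.0, 0.0]]
angles = [0.0, math.pi / 3, 2 * math.pi / 3]
output, weights = adaptive_output(structure, [4.0, 8.0, 8.0], angles)
```

In this example the filter aligned with the local structure receives weight 1 and the two others receive weight 0.25 each, so the combined response is dominated by the aligned filter.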
Disparity Selection in Binocular Pursuit
In Proc. 4th IAPR Workshop on MVA, 1994
Cited by 6 (4 self)
This paper presents a technique for disparity selection in the context of binocular pursuit. For vergence control in binocular pursuit, it is a crucial problem to find the disparity which corresponds to the target among multiple disparities generally observed in a scene. To solve the problem of the selection, we propose an approach based on histogramming the disparities obtained in the scene. Here we use an extended phase-based disparity estimation algorithm. The idea is to slice the scene using the disparity histogram so that only the target remains. The slice is chosen around a peak in the histogram using prediction of the target disparity and target location obtained by back projection. The tracking of the peak enables robustness against other, possibly dominant, objects in the scene. The approach is investigated through experiments and shown to work appropriately.
Keywords: phase-based, disparity, vergence, disparity selection, back projection, binocular pursuit
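The histogram-slicing idea can be illustrated with a short Python sketch. It assumes that a list of per-pixel disparities and a predicted target disparity are already available; the bin width, the search range around the prediction and the slice width are hypothetical parameters, and the back-projection step mentioned in the abstract is not modelled.

```python
import math


def select_target_disparity(disparities, predicted,
                            bin_width=1.0, search_bins=2, slice_bins=1):
    """Pick the histogram peak nearest the prediction and slice around it.

    Returns the lower edge of the peak bin and a boolean mask marking
    the measurements that fall inside the slice.
    """
    counts = {}
    for d in disparities:
        b = math.floor(d / bin_width)
        counts[b] = counts.get(b, 0) + 1

    # Search only near the predicted disparity, so that a larger peak
    # elsewhere (e.g. a dominant background) cannot capture the tracker.
    centre = math.floor(predicted / bin_width)
    candidates = range(centre - search_bins, centre + search_bins + 1)
    peak = max(candidates, key=lambda b: counts.get(b, 0))

    lo = (peak - slice_bins) * bin_width
    hi = (peak + slice_bins + 1) * bin_width
    mask = [lo <= d < hi for d in disparities]
    return peak * bin_width, mask


# Ten background measurements near 0 and four target measurements near 5.
measurements = [0.1] * 10 + [5.2, 5.4, 5.6, 6.1]
peak, mask = select_target_disparity(measurements, predicted=5.0)
```

Here the background peak is much larger than the target peak, but it lies outside the search range around the prediction, so the slice keeps only the four target measurements.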
Local Fourier Phase and Disparity Estimates: An Analytical Study
In Proc. IAPR 6th International Conf. CAIP, 1995
Cited by 4 (3 self)
This paper concerns local Fourier phase recovery in the context of stereo disparity localization. On occasions where depth information is required, whether absolutely or relatively, a highly desirable property is the availability of disparity estimates tightly connected to the spatial locations. Algorithms using Fourier phase provide a promising new method in this respect, based on the output of bandpass filters such as Gabor filters. As is the case for the alternative algorithms, however, the estimation depends on the input form as well as the filter form. In this article, we address this aspect of the phase-based technique and provide an analytical study of the approximation accuracy of the local phase recovery, which determines the efficiency of the whole scheme. Based on uncertainty analysis of the filter bandwidth, effective designs of bandpass filters are pointed out in terms of both disparity localization and the estimation accuracy and their behavior are s...
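A minimal Python sketch of phase-based disparity estimation on 1-D signals follows. It illustrates the general principle only (disparity is approximated by the local phase difference divided by the filter's tuning frequency) and is not the analysis developed in the paper. The function names, the 3-sigma filter support and the test signal are assumptions.

```python
import cmath
import math


def gabor_response(signal, x0, omega, sigma):
    """Complex response of a Gabor filter centred at sample x0."""
    radius = int(3 * sigma)  # assumed truncation of the Gaussian envelope
    total = 0j
    for u in range(-radius, radius + 1):
        x = x0 + u
        if 0 <= x < len(signal):
            envelope = math.exp(-u * u / (2.0 * sigma * sigma))
            total += signal[x] * envelope * cmath.exp(-1j * omega * u)
    return total


def phase_disparity(left, right, x0, omega, sigma):
    """Estimate d at x0 under the convention right[x] = left[x - d]."""
    rl = gabor_response(left, x0, omega, sigma)
    rr = gabor_response(right, x0, omega, sigma)
    # The phase of rl * conj(rr) is the phase difference wrapped to
    # (-pi, pi], so only |d| < pi / omega can be recovered.
    dphi = cmath.phase(rl * rr.conjugate())
    return dphi / omega


omega = 2 * math.pi / 16
left = [math.cos(omega * x) for x in range(128)]
right = [math.cos(omega * (x - 2)) for x in range(128)]
estimate = phase_disparity(left, right, x0=64, omega=omega, sigma=8.0)
```

For this pure sinusoid the estimate is close to the true shift of 2 samples. For real images the accuracy depends on how well the signal's local frequency matches `omega` and on the filter bandwidth, which is the dependence on input form and filter form that the abstract refers to.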
A Fast Foveated Stereo Matcher
2000
Cited by 2 (0 self)
A stereo matcher is required to form part of an active vision system currently under construction. Previous approaches to accomplishing the task of computing binocular disparity quickly have included the use of dedicated hardware and a trade-off of speed against pixel resolution. A software-based stereo matcher is reported which produces results quickly whilst retaining sub-pixel accuracy. The process of foveation is applied to a scale-space matching algorithm to produce disparity maps of spatially non-linear resolution. Results are presented evaluating the accuracy of the algorithm and also the run-time performance of a Java-based implementation executing on standard hardware.
students and to be distributed in an electronic format.
2007
Fattah i This paper presents the results of an investigation and pilot study into an active, binocular vision system driven by the SIFT algorithm. The study demonstrates a method for combining binocular vergence, object recognition and attention control in a unified framework made computationally parsimonious by the use of SIFT features as the basis of all functionality. The prototype developed is capable of identifying, targeting, verging on and recognising objects in a highly cluttered scene without the need for calibration or other knowledge of the camera geometry. This is achieved by implementing all image analysis in purely symbolic space without creating explicit pixel-space maps. The system structure is based on the ‘searchlight metaphor’ of biological systems. Results show a high level of accuracy and reliability in all system functions and a powerful ability to explore complex scenes.
Vision Algorithms and Optical Computer Architectures
the image plane and the line containing the 3D point and the principal point. See figure 1.
The process giving the pixel corresponding to a 3D point M can be explained in the following manner:
- Change the coordinate origin: the new origin is the principal point P.
- Change axes: the new axes are parallel to the rows and columns of the image.
- Do the projection onto this image plane.
- Change the image origin: the new origin is generally the upper left corner of the image (the old one was the orthogonal projection of the principal point onto the image plane).
- Scale to transform the 3D units (centimetre, metre or kilometre) into pixel units.
[Figure 1: The pinhole model]
So this model is defined by several parameters:
- the 3 coordinates of the principal point
- the 3 angles of the axis rotation
- the 2 image coordinates of the proje
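The projection steps listed above can be sketched as a small Python function. The excerpt is cut off before the full parameter list, so this is an illustration under assumptions: the rotation is passed as a ready-made 3x3 matrix instead of three angles, a single `focal` distance and a single `pixels_per_unit` scale are used, and `image_origin` is expressed in pixels so that the origin shift and the scaling collapse into one line. The excerpt's "principal point" (the projection centre) is called `center` here.

```python
def project_point(point, center, rotation, focal, pixels_per_unit, image_origin):
    """Project a 3D point to pixel coordinates with a pinhole model."""
    # 1. Change the coordinate origin to the projection centre.
    t = [point[i] - center[i] for i in range(3)]

    # 2. Change axes: rotate into the camera frame, whose x and y axes
    #    are parallel to the image columns and rows.
    xc, yc, zc = (sum(rotation[r][i] * t[i] for i in range(3)) for r in range(3))
    if zc <= 0:
        raise ValueError("point is not in front of the camera")

    # 3. Perspective projection onto the image plane at distance `focal`.
    xi = focal * xc / zc
    yi = focal * yc / zc

    # 4-5. Scale 3D units to pixels and move the origin to the image corner.
    u = image_origin[0] + pixels_per_unit * xi
    v = image_origin[1] + pixels_per_unit * yi
    return u, v


identity = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]
pixel = project_point(point=(1.0, 2.0, 4.0), center=(0.0, 0.0, 0.0),
                      rotation=identity, focal=2.0,
                      pixels_per_unit=100.0, image_origin=(320.0, 240.0))
```

With an identity rotation the point (1, 2, 4) projects to (0.5, 1.0) on the image plane, which becomes pixel (370, 340) after scaling and shifting.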
Object Recognition using
1996
This paper presents a technique to determine the identity of objects in a scene using histograms of the responses of a vector of local linear neighborhood operators (receptive fields). This technique can be used to determine the most probable objects in a scene, independent of the object's position, image-plane orientation and scale. In this paper we describe the mathematical foundations of the technique and present the results of experiments which compare robustness and recognition rates for different local neighborhood operators and histogram similarity measurements.
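To make the notion of a histogram similarity measurement concrete, the Python sketch below implements two commonly used measures, histogram intersection and a chi-square distance, and uses the former to pick the best-matching model. The excerpt does not say which measures the paper compares, so these choices, the flattened histograms and the object names are assumptions.

```python
def intersection(h, m):
    """Histogram intersection: 1.0 for identical normalized histograms."""
    return sum(min(a, b) for a, b in zip(h, m))


def chi_square(h, m):
    """Symmetric chi-square distance: 0.0 for identical histograms."""
    return sum((a - b) ** 2 / (a + b) for a, b in zip(h, m) if a + b > 0)


def most_probable_object(query, models):
    """Name of the model histogram with the largest intersection."""
    return max(models, key=lambda name: intersection(query, models[name]))


# Flattened, normalized histograms for two hypothetical objects.
models = {
    "cup": [0.4, 0.6, 0.0, 0.0],
    "ball": [0.0, 0.0, 0.5, 0.5],
}
query = [0.5, 0.5, 0.0, 0.0]
best = most_probable_object(query, models)
```

Intersection is a similarity (larger is better) while chi-square is a distance (smaller is better), so a ranking built on chi-square would use `min` instead of `max`.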
COMPARING SOME TOOLS USING FREQUENCY DOMAIN FOR THE ESTIMATION OF 1D (AND 2D) DISPARITY
In stereovision, the use of frequency domain methods raises the problems of the choice of a family of filters and of a strategy for using it. This paper compares three families: Gabor filters, finite prolate spheroidal sequences and Weng's Windowed Fourier of Gaussian (WFG). Then two strategies are compared: an image-based choice of filters and the use of a "complete" subset of filters.