Results 1 - 10 of 63
Shiftable Multiscale Transforms
, 1992
Cited by 429 (38 self)
Orthogonal wavelet transforms have recently become a popular representation for multiscale signal and image analysis. One of the major drawbacks of these representations is their lack of translation invariance: the content of wavelet subbands is unstable under translations of the input signal. Wavelet transforms are also unstable with respect to dilations of the input signal and, in two dimensions, rotations of the input signal. We formalize these problems by defining a type of translation invariance that we call "shiftability". In the spatial domain, shiftability corresponds to a lack of aliasing; thus, the conditions under which the property holds are specified by the sampling theorem. Shiftability may also be considered in the context of other domains, particularly orientation and scale. We explore "jointly shiftable" transforms that are simultaneously shiftable in more than one domain. Two examples of jointly shiftable transforms are designed and implemented: a one-dimensional tran...
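The instability described in this abstract can be seen in a small sketch. It assumes a single level of the critically sampled Haar transform as a stand-in for an orthogonal wavelet transform; the function names and the test signal are illustrative and do not come from the paper. The detail-subband energy changes when the input is shifted by one sample, even though the total signal energy does not.

```python
import math


def haar_detail(signal):
    """Detail (high-pass) coefficients of one orthonormal Haar level.

    Assumes an even-length signal; pairs are (0,1), (2,3), ...
    """
    return [(signal[i] - signal[i + 1]) / math.sqrt(2.0)
            for i in range(0, len(signal) - 1, 2)]


def energy(coeffs):
    """Sum of squared coefficients."""
    return sum(c * c for c in coeffs)


def shift_right(signal, k=1):
    """Circularly shift a list to the right by k samples."""
    k %= len(signal)
    return signal[-k:] + signal[:-k] if k else list(signal)


# A short pulse aligned with the Haar sampling grid (illustrative input).
pulse = [0.0, 0.0, 1.0, 1.0, 0.0, 0.0, 0.0, 0.0]
shifted = shift_right(pulse, 1)

e_aligned = energy(haar_detail(pulse))    # pulse edges fall between pairs
e_shifted = energy(haar_detail(shifted))  # pulse edges fall inside pairs

print("detail energy, aligned pulse:", e_aligned)
print("detail energy, shifted pulse:", e_shifted)
```

In this example the aligned pulse leaves the detail subband empty, while the one-sample shift moves energy into it. This is the kind of subband instability the abstract attributes to aliasing.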
Contour and Texture Analysis for Image Segmentation
, 2001
Cited by 302 (29 self)
This paper provides an algorithm for partitioning grayscale images into disjoint regions of coherent brightness and texture. Natural images contain both textured and untextured regions, so the cues of contour and texture differences are exploited simultaneously. Contours are treated in the intervening contour framework, while texture is analyzed using textons. Each of these cues has a domain of applicability, so to facilitate cue combination we introduce a gating operator based on the texturedness of the neighborhood at a pixel. Having obtained a local measure of how likely two nearby pixels are to belong to the same region, we use the spectral graph theoretic framework of normalized cuts to find partitions of the image into regions of coherent texture and brightness. Experimental results on a wide range of images are shown.
The steerable pyramid: A flexible architecture for multiscale derivative computation
, 1995
Cited by 232 (27 self)
We describe an architecture for efficient and accurate linear decomposition of an image into scale and orientation subbands. The basis functions of this decomposition are directional derivative operators of any desired order. We describe the construction and implementation of the transform. Differential algorithms are used in a wide variety of image processing problems. For example, gradient measurements are used as a first stage of many edge detection, depth-from-stereo, and optical flow algorithms. Higher-order derivatives have also been found useful in these applications. Extraction of these derivative quantities may be viewed as a decomposition of a signal via terms of a local Taylor series expansion [1]. Another widespread tool in signal and image processing is multiscale decomposition. Apart from the advantages of decomposing signals into information at different scales, the typical recursive form of these algorithms leads to large improvements in computational efficiency. Many authors have combined multiscale decompositions with differential measurements (e.g., [2, 3]). In these cases, a multiscale pyramid is constructed, and then differential operators (typically, differences of neighboring pixels) are applied to the subbands of the pyramid. Since both the pyramid decomposition and the derivative operation are linear and shift-invariant, we may combine them into a single operation. One advantage of doing so is that the resulting derivatives may be more accurate (see [4]). In this paper, we propose a simple, efficient decomposition architecture for combining these two operations. The decomposition is the latest incarnation of ... (Footnote 1: Source code and filter kernels for implementation of the steerable pyramid are available via anonymous ftp from ftp.cis.upenn.edu:pub/eero/steerpyr.tar.Z)
Deformable Kernels for Early Vision
 IEEE Transactions on Pattern Analysis and Machine Intelligence
, 1991
Cited by 131 (9 self)
Early vision algorithms often have a first stage of linear filtering that `extracts' from the image information at multiple scales of resolution and multiple orientations. A common difficulty in the design and implementation of such schemes is that one feels compelled to discretize coarsely the space of scales and orientations in order to reduce computation and storage costs. This discretization produces anisotropies due to a loss of translation, rotation, and scaling invariance that makes early vision algorithms less precise and more difficult to design. This need not be so: one can compute and store efficiently the response of families of linear filters defined on a continuum of orientations and scales. A technique is presented that allows one (1) to compute the best approximation of a given family using linear combinations of a small number of `basis' functions; (2) to describe all finite-dimensional families, i.e., the families of filters for which a finite-dimensional representation is p...
An Active Vision Architecture based on Iconic Representations
 Artificial Intelligence
, 1995
Cited by 129 (13 self)
Active vision systems have the capability of continuously interacting with the environment. The rapidly changing environment of such systems means that it is attractive to replace static representations with visual routines that compute information on demand. Such routines place a premium on image data structures that are easily computed and used. The purpose of this paper is to propose a general active vision architecture based on efficiently computable iconic representations. This architecture employs two primary visual routines, one for identifying the visual image near the fovea (object identification), and another for locating a stored prototype on the retina (object location). This design allows complex visual behaviors to be obtained by composing these two routines with different parameters. The iconic representations are comprised of high-dimensional feature vectors obtained from the responses of an ensemble of Gaussian derivative spatial filters at a number of orientations and...
Representing local structure using tensors
 Computer Vision Laboratory, Linkoping University
, 1989
Cited by 108 (29 self)
The fundamental problem of finding a suitable representation of the orientation of 3D surfaces is considered. A representation is regarded as suitable if it meets three basic requirements: Uniqueness, Uniformity and Polar separability. A suitable tensor representation is given. At the heart of the problem lies the fact that orientation can only be defined mod 180°, i.e., the fact that a 180° rotation of a line or a plane amounts to no change at all. For this reason, representing a plane using its normal vector leads to ambiguity, and such a representation is consequently not suitable. The ambiguity can be eliminated by establishing a mapping between R3 and a higher-dimensional tensor space. The uniqueness requirement implies a mapping that maps all pairs of 3D vectors x and −x onto the same tensor T. Uniformity implies that the mapping implicitly carries a definition of distance between 3D planes (and lines) that is rotation invariant and monotone with the angle between the planes. Polar separability means that the norm of the representing tensor T is rotation invariant. One way to describe the mapping is that it maps a 3D sphere into 6D in such a way that the surface is uni
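Two of the requirements named in this abstract can be checked numerically on a small example. The abstract does not state the mapping itself, so the sketch below assumes the outer-product form T = x xᵀ / |x|, which is one mapping with the stated sign symmetry. The function names and the sample vector are illustrative only.

```python
import math


def orientation_tensor(x):
    """Assumed mapping: T = x x^T / |x| for a nonzero 3D vector x."""
    n = math.sqrt(sum(c * c for c in x))
    return [[a * b / n for b in x] for a in x]


def frobenius_norm(T):
    """Square root of the sum of squared tensor entries."""
    return math.sqrt(sum(v * v for row in T for v in row))


def rotate_z(x, angle):
    """Rotate a 3D vector about the z axis by `angle` radians."""
    c, s = math.cos(angle), math.sin(angle)
    return [c * x[0] - s * x[1], s * x[0] + c * x[1], x[2]]


x = [1.0, 2.0, 2.0]          # |x| = 3
neg_x = [-c for c in x]

T_pos = orientation_tensor(x)
T_neg = orientation_tensor(neg_x)

# Uniqueness: x and -x map to the same tensor.
print("same tensor for x and -x:", T_pos == T_neg)

# Polar separability: the tensor norm does not change under rotation.
norm_before = frobenius_norm(T_pos)
norm_after = frobenius_norm(orientation_tensor(rotate_z(x, 0.7)))
print("norm before / after rotation:", norm_before, norm_after)
```

Under this assumed mapping the tensor is symmetric, so it has six independent components.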
Television control by hand gestures
 International Workshop on Automatic Face and Gesture Recognition
, 1995
Cited by 87 (3 self)
We study how a viewer can control a television set remotely by hand gestures. We address two fundamental issues of gesture-based human-computer interaction: (1) How can one communicate a rich set of commands without extensive user training and memorization of gestures? (2) How can the computer recognize the commands in a complicated visual environment? We made a prototype of this system using a computer workstation and a television. The graphical overlays appear on the computer screen, although they could be mixed with the video to appear on the television. The computer controls the television set through serial port commands to an electronically controlled remote control. We describe knowledge we gained from building the prototype.
Textons, contours and regions: Cue integration in image segmentation
 In International Conference on Computer Vision
, 1999
Cited by 87 (9 self)
This paper makes two contributions. It provides (1) an operational definition of textons, the putative elementary units of texture perception, and (2) an algorithm for partitioning the image into disjoint regions of coherent brightness and texture, where boundaries of regions are defined by peaks in contour orientation energy and differences in texton densities across the contour. Julesz introduced the term texton, analogous to a phoneme in speech recognition, but did not provide an operational definition for gray-level images. Here we reinvent textons as frequently co-occurring combinations of oriented linear filter outputs. These can be learned using a K-means approach. By mapping each pixel to its nearest texton, the image can be analyzed into texton channels, each of which is a point set where discrete techniques such as Voronoi diagrams become applicable. Local histograms of texton frequencies can be used with a χ² test for significant differences to find texture boundaries. Natural images contain both textured and untextured regions, so we combine this cue with that of the presence of peaks of contour energy derived from outputs of odd- and even-symmetric oriented Gaussian derivative filters. Each of these cues has a domain of applicability, so to facilitate cue combination we introduce a gating operator based on a statistical test for isotropy of Delaunay neighbors. Having obtained a local measure of how likely two nearby pixels are to belong to the same region, we use the spectral graph theoretic framework of normalized cuts to find partitions of the image into regions of coherent texture and brightness. Experimental results on a wide range of images are shown.
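The histogram comparison mentioned in this abstract can be sketched as follows. The abstract does not give the formula, so the code assumes one common form of the χ² statistic for normalized histograms, 0.5 · Σ (hᵢ − gᵢ)² / (hᵢ + gᵢ). The texton labels, window contents, and function names are hypothetical.

```python
def texton_histogram(labels, num_textons):
    """Normalized histogram of texton labels (integers in [0, num_textons))."""
    counts = [0] * num_textons
    for k in labels:
        counts[k] += 1
    n = len(labels)
    return [c / n for c in counts]


def chi_square_distance(h, g):
    """Assumed form: 0.5 * sum((h_i - g_i)^2 / (h_i + g_i)), skipping empty bins."""
    total = 0.0
    for hi, gi in zip(h, g):
        s = hi + gi
        if s > 0:
            total += (hi - gi) ** 2 / s
    return 0.5 * total


# Hypothetical texton labels of pixels on either side of a candidate boundary.
left_window = [0, 0, 1, 1]
right_window = [1, 1, 2, 2]

h = texton_histogram(left_window, num_textons=3)
g = texton_histogram(right_window, num_textons=3)

d_across = chi_square_distance(h, g)   # partially overlapping textures
d_same = chi_square_distance(h, h)     # identical textures

print("distance across boundary:", d_across)
print("distance within region:", d_same)
```

With normalized histograms this form of the statistic ranges from 0 for identical histograms to 1 for histograms with no shared bins, so a threshold on it could serve as a simple texture-difference cue.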
Steerable-Scalable Kernels for Edge Detection and Junction Analysis
 Image and Vision Computing
, 1992
Cited by 80 (1 self)
Families of kernels that are useful in a variety of early vision algorithms may be obtained by rotating and scaling in a continuum a `template' kernel. This multiscale, multi-orientation family may be approximated by linear interpolation of a discrete finite set of appropriate `basis' kernels. A scheme for generating such a basis together with the appropriate interpolation weights is described. Unlike previous schemes by Perona and by Simoncelli et al., it is guaranteed to generate the most parsimonious one. Additionally, it is shown how to exploit two symmetries in edge-detection kernels for reducing storage and computational costs and generating simultaneously endstop- and junction-tuned filters for free.
Steerable Wedge Filters for Local Orientation Analysis
 IEEE Trans. Image Processing
, 1996
Cited by 70 (1 self)
Steerable filters have been used to analyze local orientation patterns in imagery. Such filters are typically based on directional derivatives, whose symmetry produces orientation responses that are periodic with period π, independent of image structure. We present a more general set of steerable filters that alleviate this problem.
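The periodicity this abstract refers to can be illustrated with the simplest steerable case. The sketch assumes a first-derivative filter steered as cos θ · gx + sin θ · gy, where gx and gy are the horizontal and vertical derivative responses at one pixel. The function names and response values are hypothetical, and the wedge filters proposed in the paper are not implemented here.

```python
import math


def steer_first_derivative(gx, gy, theta):
    """Response of a first-derivative filter steered to angle theta (radians).

    Assumes gx and gy are the x- and y-derivative responses at one pixel.
    """
    return math.cos(theta) * gx + math.sin(theta) * gy


def orientation_energy(gx, gy, theta):
    """Squared steered response, used here as an orientation measure."""
    r = steer_first_derivative(gx, gy, theta)
    return r * r


# Hypothetical derivative responses at one pixel.
gx, gy = 0.8, -0.3
theta = 0.4

r_theta = steer_first_derivative(gx, gy, theta)
r_opposite = steer_first_derivative(gx, gy, theta + math.pi)

e_theta = orientation_energy(gx, gy, theta)
e_opposite = orientation_energy(gx, gy, theta + math.pi)

print("response at theta and theta + pi:", r_theta, r_opposite)
print("energy at theta and theta + pi:", e_theta, e_opposite)
```

The response flips sign under a rotation by π, so its energy is identical at θ and θ + π. An energy measure of this kind therefore cannot tell a structure that extends in one direction from one that extends in the opposite direction, whatever the image content.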