Results 1 - 10 of 82
Shiftable Multiscale Transforms
1992
Cited by 557 (36 self)
Orthogonal wavelet transforms have recently become a popular representation for multiscale signal and image analysis. One of the major drawbacks of these representations is their lack of translation invariance: the content of wavelet subbands is unstable under translations of the input signal. Wavelet transforms are also unstable with respect to dilations of the input signal, and in two dimensions, rotations of the input signal. We formalize these problems by defining a type of translation invariance that we call "shiftability". In the spatial domain, shiftability corresponds to a lack of aliasing; thus, the conditions under which the property holds are specified by the sampling theorem. Shiftability may also be considered in the context of other domains, particularly orientation and scale. We explore "jointly shiftable" transforms that are simultaneously shiftable in more than one domain. Two examples of jointly shiftable transforms are designed and implemented: a one-dimensional tran...
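The instability described in this abstract can be seen in a few lines. The sketch below is an illustration only, not the paper's construction: it assumes a single-level, critically sampled Haar transform and a circular one-sample shift (both choices are ours), and compares the energy in the detail subband before and after the shift. The helper names `haar_detail`, `energy` and `circular_shift` are hypothetical.

```python
import math


def haar_detail(signal):
    """Level-1 Haar detail (high-pass) coefficients, decimated by 2."""
    return [(signal[i] - signal[i + 1]) / math.sqrt(2)
            for i in range(0, len(signal) - 1, 2)]


def energy(coeffs):
    """Sum of squared coefficients in a subband."""
    return sum(c * c for c in coeffs)


def circular_shift(signal, k):
    """Shift a list right by k samples with wrap-around."""
    k %= len(signal)
    return signal[-k:] + signal[:-k] if k else list(signal)


# A step edge aligned with the sampling grid of the decimated transform.
step = [0.0, 0.0, 0.0, 0.0, 1.0, 1.0, 1.0, 1.0]
shifted = circular_shift(step, 1)

energy_original = energy(haar_detail(step))
energy_shifted = energy(haar_detail(shifted))
print(energy_original, energy_shifted)
```

The unshifted step leaves the detail subband empty, while the same signal moved by one sample puts energy into it. In a shiftable transform the subband energy would not depend on the position of the input in this way.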
A Parametric Texture Model based on Joint Statistics of Complex Wavelet Coefficients
International Journal of Computer Vision, 2000
Cited by 409 (13 self)
We present a universal statistical model for texture images in the context of an overcomplete complex wavelet transform. The model is parameterized by a set of statistics computed on pairs of coefficients corresponding to basis functions at adjacent spatial locations, orientations, and scales. We develop an efficient algorithm for synthesizing random images subject to these constraints, by iteratively projecting onto the set of images satisfying each constraint, and we use this to test the perceptual validity of the model. In particular, we demonstrate the necessity of subgroups of the parameter set by showing examples of texture synthesis that fail when those parameters are removed from the set. We also demonstrate the power of our model by successfully synthesizing examples drawn from a diverse collection of artificial and natural textures.
Contour and Texture Analysis for Image Segmentation
2001
Cited by 406 (29 self)
This paper provides an algorithm for partitioning grayscale images into disjoint regions of coherent brightness and texture. Natural images contain both textured and untextured regions, so the cues of contour and texture differences are exploited simultaneously. Contours are treated in the intervening contour framework, while texture is analyzed using textons. Each of these cues has a domain of applicability, so to facilitate cue combination we introduce a gating operator based on the texturedness of the neighborhood at a pixel. Having obtained a local measure of how likely two nearby pixels are to belong to the same region, we use the spectral graph theoretic framework of normalized cuts to find partitions of the image into regions of coherent texture and brightness. Experimental results on a wide range of images are shown.
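The normalized-cut criterion named in this abstract scores a two-way partition of a weighted graph as cut(A, B) / assoc(A, V) + cut(A, B) / assoc(B, V). The sketch below only evaluates that cost for a given partition on a toy affinity matrix; it does not perform the spectral relaxation the paper uses to search for good partitions, and the matrix `affinity` and function name `ncut_value` are made up for illustration.

```python
def ncut_value(weights, group_a):
    """Normalized-cut cost of splitting the nodes into group_a and the rest.

    weights is a symmetric matrix (list of lists) of non-negative affinities.
    """
    n = len(weights)
    a = set(group_a)
    b = set(range(n)) - a
    if not a or not b:
        raise ValueError("both sides of the partition must be non-empty")
    # Total weight of edges crossing the partition.
    cut = sum(weights[i][j] for i in a for j in b)
    # Total connection from each side to every node in the graph.
    assoc_a = sum(weights[i][j] for i in a for j in range(n))
    assoc_b = sum(weights[i][j] for i in b for j in range(n))
    return cut / assoc_a + cut / assoc_b


# Hypothetical affinities: pixels 0-1 and 2-3 are similar, 1-2 only weakly so.
affinity = [
    [0.0, 1.0, 0.0, 0.0],
    [1.0, 0.0, 0.1, 0.0],
    [0.0, 0.1, 0.0, 1.0],
    [0.0, 0.0, 1.0, 0.0],
]

balanced = ncut_value(affinity, [0, 1])   # cut through the weak link
lopsided = ncut_value(affinity, [0])      # isolate a single pixel
print(balanced, lopsided)
```

Cutting the weak link gives a much lower cost than isolating one pixel, which is the behavior that makes the criterion prefer partitions into coherent regions over small outlying groups.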
The steerable pyramid: A flexible architecture for multiscale derivative computation
Presented at: 2nd Annual IEEE International Conference on Image, 1995
Cited by 331 (30 self)
We describe an architecture for efficient and accurate linear decomposition of an image into scale and orientation subbands. The basis functions of this decomposition are directional derivative operators of any desired order. We describe the construction and implementation of the transform.
Deformable Kernels for Early Vision
IEEE Trans. Pattern Anal. Mach. Intell., 1995
Cited by 145 (11 self)
Early vision algorithms often have a first stage of linear filtering that ‘extracts’ from the image information at multiple scales of resolution and multiple orientations. A common difficulty in the design and implementation of such schemes is that one feels compelled to discretize coarsely the space of scales and orientations in order to reduce computation and storage costs. This discretization produces anisotropies due to a loss of translation, rotation, and scaling invariance that makes early vision algorithms less precise and more difficult to design. This need not be so: one can compute and store efficiently the response of families of linear filters defined on a continuum of orientations and scales. A technique is presented that allows 1) computing the best approximation of a given family using linear combinations of a small number of ‘basis’ functions; 2) describing all finite-dimensional families, i.e., the families of filters for which a finite-dimensional representation is possible with no error. The technique is based on singular value decomposition and may be applied to generating filters in arbitrary dimensions and subject to arbitrary deformations; the relevant functional analysis results are reviewed and precise conditions for the decomposition to be feasible are stated. Experimental results are presented that demonstrate the applicability of the technique to generating multi-orientation multi-scale 2D edge-detection kernels. The implementation issues are also discussed. Index Terms: Steerable filters, wavelets, early vision, multiresolution image analysis, multirate filtering, deformable filters, scale-space.
An Active Vision Architecture based on Iconic Representations
Artificial Intelligence, 1995
Cited by 143 (13 self)
Active vision systems have the capability of continuously interacting with the environment. The rapidly changing environment of such systems means that it is attractive to replace static representations with visual routines that compute information on demand. Such routines place a premium on image data structures that are easily computed and used. The purpose of this paper is to propose a general active vision architecture based on efficiently computable iconic representations. This architecture employs two primary visual routines, one for identifying the visual image near the fovea (object identification), and another for locating a stored prototype on the retina (object location). This design allows complex visual behaviors to be obtained by composing these two routines with different parameters. The iconic representations are comprised of high-dimensional feature vectors obtained from the responses of an ensemble of Gaussian derivative spatial filters at a number of orientations and...
Representing local structure using tensors
Computer Vision Laboratory, Linkoping University, 1989
Cited by 142 (34 self)
The fundamental problem of finding a suitable representation of the orientation of 3D surfaces is considered. A representation is regarded suitable if it meets three basic requirements: Uniqueness, Uniformity and Polar separability. A suitable tensor representation is given. At the heart of the problem lies the fact that orientation can only be defined mod 180°, i.e., the fact that a 180° rotation of a line or a plane amounts to no change at all. For this reason representing a plane using its normal vector leads to ambiguity and such a representation is consequently not suitable. The ambiguity can be eliminated by establishing a mapping between R^3 and a higher-dimensional tensor space. The uniqueness requirement implies a mapping that maps all pairs of 3D vectors x and -x onto the same tensor T. Uniformity implies that the mapping implicitly carries a definition of distance between 3D planes (and lines) that is rotation invariant and monotone with the angle between the planes. Polar separability means that the norm of the representing tensor T is rotation invariant. One way to describe the mapping is that it maps a 3D sphere into 6D in such a way that the surface is uni...
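A small numerical check can make the uniqueness and polar-separability requirements concrete. The sketch below assumes the outer-product form T = x x^T / |x|, which is one mapping with these properties; the abstract does not spell out the formula, so treat this form and the helper names `orientation_tensor` and `frobenius_norm` as assumptions for illustration.

```python
import math


def orientation_tensor(x):
    """Map a non-zero 3D vector x to the symmetric 3x3 tensor x x^T / |x|."""
    norm = math.sqrt(sum(c * c for c in x))
    if norm == 0.0:
        raise ValueError("orientation is undefined for the zero vector")
    return [[xi * xj / norm for xj in x] for xi in x]


def frobenius_norm(tensor):
    """Square root of the sum of squared tensor entries."""
    return math.sqrt(sum(v * v for row in tensor for v in row))


x = [1.0, 2.0, 2.0]            # |x| = 3
neg_x = [-c for c in x]        # same orientation, opposite direction

t_pos = orientation_tensor(x)
t_neg = orientation_tensor(neg_x)

print(t_pos == t_neg)          # x and -x give the same tensor
print(frobenius_norm(t_pos))   # close to |x|, independent of direction
```

Because the tensor is symmetric, it has six independent components, which matches the abstract's description of a mapping from 3D into 6D. Its norm depends only on |x|, so rotating x leaves the norm unchanged.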
Television control by hand gestures
International Workshop on Automatic Face and Gesture Recognition, 1995
Cited by 125 (3 self)
We study how a viewer can control a television set remotely by hand gestures. We address two fundamental issues of gesture-based human-computer interaction: (1) How can one communicate a rich set of commands without extensive user training and memorization of gestures? (2) How can the computer recognize the commands in a complicated visual environment? We made a prototype of this system using a computer workstation and a television. The graphical overlays appear on the computer screen, although they could be mixed with the video to appear on the television. The computer controls the television set through serial port commands to an electronically controlled remote control. We describe knowledge we gained from building the prototype.
Textons, contours and regions: Cue integration in image segmentation
In International Conference on Computer Vision, 1999
Cited by 109 (9 self)
This paper makes two contributions. It provides (1) an operational definition of textons, the putative elementary units of texture perception, and (2) an algorithm for partitioning the image into disjoint regions of coherent brightness and texture, where boundaries of regions are defined by peaks in contour orientation energy and differences in texton densities across the contour. Julesz introduced the term texton, analogous to a phoneme in speech recognition, but did not provide an operational definition for gray-level images. Here we reinvent textons as frequently co-occurring combinations of oriented linear filter outputs. These can be learned using a K-means approach. By mapping each pixel to its nearest texton, the image can be analyzed into texton channels, each of which is a point set where discrete techniques such as Voronoi diagrams become applicable. Local histograms of texton frequencies can be used with a χ² test for significant differences to find texture boundaries. Natural images contain both textured and untextured regions, so we combine this cue with that of the presence of peaks of contour energy derived from outputs of odd- and even-symmetric oriented Gaussian derivative filters. Each of these cues has a domain of applicability, so to facilitate cue combination we introduce a gating operator based on a statistical test for isotropy of Delaunay neighbors. Having obtained a local measure of how likely two nearby pixels are to belong to the same region, we use the spectral graph theoretic framework of normalized cuts to find partitions of the image into regions of coherent texture and brightness. Experimental results on a wide range of images are shown.
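Comparing local texton histograms is the step most easily shown in code. The sketch below assumes the common symmetric chi-squared histogram statistic, 0.5 * sum((a - b)^2 / (a + b)) over bins; the abstract does not give the exact formula, and the function name `chi_squared_distance` and the example histograms are hypothetical.

```python
def chi_squared_distance(hist_a, hist_b):
    """Symmetric chi-squared statistic between two histograms of equal length."""
    if len(hist_a) != len(hist_b):
        raise ValueError("histograms must have the same number of bins")
    total = 0.0
    for a, b in zip(hist_a, hist_b):
        # Bins that are empty in both histograms contribute nothing.
        if a + b > 0:
            total += (a - b) ** 2 / (a + b)
    return 0.5 * total


# Hypothetical normalized texton histograms over four texton channels.
window_left = [0.5, 0.5, 0.0, 0.0]
window_same = [0.5, 0.5, 0.0, 0.0]
window_right = [0.0, 0.0, 0.5, 0.5]

same_texture = chi_squared_distance(window_left, window_same)
across_boundary = chi_squared_distance(window_left, window_right)
print(same_texture, across_boundary)
```

For normalized histograms this statistic ranges from 0 (identical texton mixtures) to 1 (no textons in common), so a large value between neighboring windows suggests a texture boundary between them.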
Steerable-Scalable Kernels for Edge Detection and Junction Analysis
Image and Vision Computing, 1992
Cited by 91 (1 self)
Families of kernels that are useful in a variety of early vision algorithms may be obtained by rotating and scaling in a continuum a 'template' kernel. This multi-scale multi-orientation family may be approximated by linear interpolation of a discrete finite set of appropriate 'basis' kernels. A scheme for generating such a basis together with the appropriate interpolation weights is described. Unlike previous schemes by Perona and by Simoncelli et al., it is guaranteed to generate the most parsimonious one. Additionally, it is shown how to exploit two symmetries in edge-detection kernels for reducing storage and computational costs and generating simultaneously endstop- and junction-tuned filters for free.
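The idea of interpolating a rotated kernel from a small basis has a well-known special case in which the interpolation is exact: the first derivative of a Gaussian at angle theta equals cos(theta) times the x-derivative plus sin(theta) times the y-derivative. The sketch below checks this numerically. It is not the scheme of this paper, which also covers scale and general kernels; the kernel size, sigma, and the function names `gaussian_derivative_kernel` and `steer` are assumptions chosen for illustration.

```python
import math


def gaussian_derivative_kernel(theta, radius=3, sigma=1.0):
    """Sample the directional derivative of a 2D Gaussian along angle theta."""
    c, s = math.cos(theta), math.sin(theta)
    kernel = []
    for y in range(-radius, radius + 1):
        row = []
        for x in range(-radius, radius + 1):
            g = math.exp(-(x * x + y * y) / (2.0 * sigma * sigma))
            # Derivative of the Gaussian along the unit direction (c, s).
            row.append(-(x * c + y * s) / (sigma * sigma) * g)
        kernel.append(row)
    return kernel


def steer(basis_x, basis_y, theta):
    """Interpolate the kernel at angle theta from the two basis kernels."""
    c, s = math.cos(theta), math.sin(theta)
    return [[c * bx + s * by for bx, by in zip(row_x, row_y)]
            for row_x, row_y in zip(basis_x, basis_y)]


basis_x = gaussian_derivative_kernel(0.0)
basis_y = gaussian_derivative_kernel(math.pi / 2.0)

theta = math.radians(30.0)
direct = gaussian_derivative_kernel(theta)
steered = steer(basis_x, basis_y, theta)

max_error = max(abs(d - s)
                for row_d, row_s in zip(direct, steered)
                for d, s in zip(row_d, row_s))
print(max_error)
```

Because convolution is linear, the same weights applied to the two basis filter responses give the response at any orientation, so only two convolutions need to be computed and stored for this family.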