Atomic decomposition by basis pursuit
 SIAM Journal on Scientific Computing
, 1998
Cited by 1660 (43 self)
Abstract. The time-frequency and time-scale communities have recently developed a large number of overcomplete waveform dictionaries: stationary wavelets, wavelet packets, cosine packets, chirplets, and warplets, to name a few. Decomposition into overcomplete systems is not unique, and several methods for decomposition have been proposed, including the method of frames (MOF), matching pursuit (MP), and, for special dictionaries, the best orthogonal basis (BOB). Basis Pursuit (BP) is a principle for decomposing a signal into an “optimal” superposition of dictionary elements, where optimal means having the smallest l1 norm of coefficients among all such decompositions. We give examples exhibiting several advantages over MOF, MP, and BOB, including better sparsity and superresolution. BP has interesting relations to ideas in areas as diverse as ill-posed problems, abstract harmonic analysis, total variation denoising, and multiscale edge denoising. BP in highly overcomplete dictionaries leads to large-scale optimization problems. With signals of length 8192 and a wavelet packet dictionary, one gets an equivalent linear program of size 8192 by 212,992. Such problems can be attacked successfully only because of recent advances in linear programming by interior-point methods. We obtain reasonable success with a primal-dual logarithmic barrier method and conjugate-gradient solver.
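The problem size quoted in this abstract can be checked with a short worked reformulation. The sketch below assumes the standard device of splitting the coefficient vector into its positive and negative parts, and assumes a wavelet packet dictionary with n log2 n atoms for a signal of length n. The symbols Φ (dictionary matrix), α (coefficients), s (signal), u, v, n, and p are notation chosen here for illustration and do not appear in the listing.

```latex
\begin{align*}
&\min_{\alpha \in \mathbb{R}^{p}} \ \|\alpha\|_{1}
   \quad \text{subject to} \quad \Phi\alpha = s,
   \qquad \Phi \in \mathbb{R}^{n \times p},\ s \in \mathbb{R}^{n} \\[4pt]
&\text{Write } \alpha = u - v \text{ with } u, v \ge 0: \\[4pt]
&\min_{u, v} \ \mathbf{1}^{\top} u + \mathbf{1}^{\top} v
   \quad \text{subject to} \quad
   \begin{bmatrix} \Phi & -\Phi \end{bmatrix}
   \begin{bmatrix} u \\ v \end{bmatrix} = s,
   \qquad u, v \ge 0 \\[4pt]
&n = 8192, \qquad p = n \log_{2} n = 8192 \cdot 13 = 106{,}496,
   \qquad 2p = 212{,}992
\end{align*}
```

Under these assumptions the constraint matrix has n = 8192 rows and 2p = 212,992 columns, which matches the linear program size stated above.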
A Performance Evaluation of Local Descriptors
, 2005
Cited by 1157 (38 self)
In this paper we compare the performance of descriptors computed for local interest regions, as for example extracted by the Harris-Affine detector [32]. Many different descriptors have been proposed in the literature. However, it is unclear which descriptors are more appropriate and how their performance depends on the interest region detector. The descriptors should be distinctive and at the same time robust to changes in viewing conditions as well as to errors of the detector. Our evaluation uses as criterion recall with respect to precision and is carried out for different image transformations. We compare shape context [3], steerable filters [12], PCA-SIFT [19], differential invariants [20], spin images [21], SIFT [26], complex filters [37], moment invariants [43], and cross-correlation for different types of interest regions. We also propose an extension of the SIFT descriptor, and show that it outperforms the original method. Furthermore, we observe that the ranking of the descriptors is mostly independent of the interest region detector and that the SIFT-based descriptors perform best. Moments and steerable filters show the best performance among the low-dimensional descriptors.
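The evaluation criterion named in this abstract, recall with respect to precision, can be illustrated with a minimal sketch. The definitions below are the usual ones and are an assumption here, not quoted from the paper; the function name and its parameters are hypothetical.

```python
def recall_and_precision(correct_matches, false_matches, total_correspondences):
    """Return (recall, precision) for one descriptor-matching threshold.

    Assumed definitions (not taken from the paper itself):
      recall    = correct matches / ground-truth correspondences
      precision = correct matches / all matches returned
    """
    if total_correspondences <= 0:
        raise ValueError("total_correspondences must be positive")
    if correct_matches < 0 or false_matches < 0:
        raise ValueError("match counts must be non-negative")

    returned = correct_matches + false_matches
    recall = correct_matches / total_correspondences
    # Convention chosen for this sketch: with no matches returned,
    # precision is reported as 1.0 (nothing returned was wrong).
    precision = correct_matches / returned if returned else 1.0
    return recall, precision
```

Sweeping the matching threshold and calling this function at each setting would trace out one recall-precision curve per descriptor.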
Relations between the statistics of natural images and the response properties of cortical cells
 J. Opt. Soc. Am. A
, 1987
Cited by 607 (13 self)
The relative efficiency of any particular image-coding scheme should be defined only in relation to the class of images that the code is likely to encounter. To understand the representation of images by the mammalian visual system, it might therefore be useful to consider the statistics of images from the natural environment (i.e., images with trees, rocks, bushes, etc.). In this study, various coding schemes are compared in relation to how they represent the information in such natural images. The coefficients of such codes are represented by arrays of mechanisms that respond to local regions of space, spatial frequency, and orientation (Gabor-like transforms). For many classes of image, such codes will not be an efficient means of representing information. However, the results obtained with six natural images suggest that the orientation and the spatial-frequency tuning of mammalian simple cells are well suited for coding the information in such images if the goal of the code is to convert higher-order redundancy (e.g., correlation between the intensities of neighboring pixels) into first-order redundancy (i.e., the response distribution of the coefficients). Such coding produces a relatively high signal-to-noise ratio and permits information to be transmitted with only a subset of the total number of cells. These results support Barlow's theory that the goal of natural vision is to represent the information in the natural environment with minimal redundancy.
Shiftable Multiscale Transforms
, 1992
Cited by 429 (38 self)
Orthogonal wavelet transforms have recently become a popular representation for multiscale signal and image analysis. One of the major drawbacks of these representations is their lack of translation invariance: the content of wavelet subbands is unstable under translations of the input signal. Wavelet transforms are also unstable with respect to dilations of the input signal and, in two dimensions, rotations of the input signal. We formalize these problems by defining a type of translation invariance that we call "shiftability". In the spatial domain, shiftability corresponds to a lack of aliasing; thus, the conditions under which the property holds are specified by the sampling theorem. Shiftability may also be considered in the context of other domains, particularly orientation and scale. We explore "jointly shiftable" transforms that are simultaneously shiftable in more than one domain. Two examples of jointly shiftable transforms are designed and implemented: a one-dimensional tran...
A metric for distributions with applications to image databases
, 1998
Cited by 311 (4 self)
We introduce a new distance between two distributions that we call the Earth Mover’s Distance (EMD), which reflects the minimal amount of work that must be performed to transform one distribution into the other by moving “distribution mass” around. This is a special case of the transportation problem from linear optimization, for which efficient algorithms are available. The EMD also allows for partial matching. When used to compare distributions that have the same overall mass, the EMD is a true metric, and has easy-to-compute lower bounds. In this paper we focus on applications to image databases, especially color and texture. We use the EMD to exhibit the structure of color-distribution and texture spaces by means of Multi-Dimensional Scaling displays. We also propose a novel approach to the problem of navigating through a collection of color images, which leads to a new paradigm for image database search.
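The idea of "work needed to move distribution mass" can be made concrete in a restricted setting. The sketch below is not the general transportation-problem solver the abstract refers to. It assumes one-dimensional histograms on equally spaced bins, equal total mass, and a ground distance of |i - j| between bins i and j. In that special case the minimal work equals the sum of absolute differences between the running totals of the two histograms. The function name `emd_1d` is hypothetical.

```python
def emd_1d(p, q, tol=1e-9):
    """Earth Mover's Distance between two 1-D histograms of equal total mass.

    Assumptions of this sketch: bins are equally spaced, the ground distance
    between bins i and j is |i - j|, and both histograms carry the same
    positive total mass. The result is normalized by that total mass.
    """
    if len(p) != len(q):
        raise ValueError("histograms must have the same number of bins")
    total = float(sum(p))
    if total <= 0 or abs(total - sum(q)) > tol:
        raise ValueError("histograms must have equal, positive total mass")

    carried = 0.0  # surplus mass that still has to move past the current bin
    work = 0.0
    for a, b in zip(p, q):
        carried += a - b
        work += abs(carried)  # moving `carried` units one bin costs |carried|
    return work / total
```

For example, `emd_1d([1, 0, 0], [0, 0, 1])` moves one unit of mass across two bins. Histograms with unequal mass or a general ground distance need a linear-programming or network-flow solver instead.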
Image Representation Using 2D Gabor Wavelets
 IEEE Trans. Pattern Analysis and Machine Intelligence
, 1996
Cited by 264 (4 self)
This paper extends to two dimensions the frame criterion developed by Daubechies for one-dimensional wavelets, and it computes the frame bounds for the particular case of 2D Gabor wavelets. Completeness criteria for 2D Gabor image representations are important because of their increasing role in many computer vision applications and also in modeling biological vision, since recent neurophysiological evidence from the visual cortex of mammalian brains suggests that the filter response profiles of the main class of linearly responding cortical neurons (called simple cells) are best modeled as a family of self-similar 2D Gabor wavelets. We therefore derive the conditions under which a set of continuous 2D Gabor wavelets will provide a complete representation of any image, and we also find self-similar wavelet parameterizations which allow stable reconstruction by summation as though the wavelets formed an orthonormal basis. Approximating a "tight frame" generates redundancy which allows low-resolution neural responses to represent high-resolution images, as we illustrate by image reconstructions with severely quantized 2D Gabor coefficients. Index Terms: Gabor wavelets, coarse coding, image representation, visual cortex, image reconstruction.
Geodesic Active Regions and Level Set Methods for Supervised Texture Segmentation
 International Journal of Computer Vision
, 2002
Cited by 234 (8 self)
This paper presents a novel variational framework to deal with frame partition problems in Computer Vision. This framework exploits boundary and region-based segmentation modules under a curve-based optimization objective function. The task of supervised texture segmentation is considered to demonstrate the potential of the proposed framework. The textured feature space is generated by filtering the given textured images using isotropic and anisotropic filters, and analyzing their responses as multi-component conditional probability density functions. The texture segmentation is obtained by unifying region and boundary-based information as an improved Geodesic Active Contour Model. The defined objective function is minimized using a gradient-descent method where a level set approach is used to implement the obtained PDE. According to this PDE, the curve propagation towards the final solution is guided by boundary and region-based segmentation forces, and is constrained by a regularity force. The level set implementation is performed using a fast front propagation algorithm where topological changes are naturally handled. The performance of our method is demonstrated on a variety of synthetic and real textured frames.
Recognition without Correspondence using Multidimensional Receptive Field Histograms
 International Journal of Computer Vision
, 2000
Cited by 209 (19 self)
The appearance of an object is composed of local structure. This local structure can be described and characterized by a vector of local features measured by local operators such as Gaussian derivatives or Gabor filters. This article presents a technique where appearances of objects are represented by the joint statistics of such local neighborhood operators. As such, this represents a new class of appearance-based techniques for computer vision. Based on joint statistics, the paper develops techniques for the identification of multiple objects at arbitrary positions and orientations in a cluttered scene. Experiments show that these techniques can identify over 100 objects in the presence of major occlusions. Most remarkably, the techniques have low complexity and therefore run in real time. 1. Introduction The paper proposes a framework for the statistical representation of the appearance of arbitrary 3D objects. This representation consists of a probability density function or jo...
Efficient time series matching by wavelets
 Proc. of 15th Int'l Conf. on Data Engineering
, 1999
Cited by 205 (1 self)
Time series stored as feature vectors can be indexed by multidimensional index trees like R-Trees for fast retrieval. Due to the dimensionality curse problem, transformations are applied to time series to reduce the number of dimensions of the feature vectors. Different transformations like Discrete Fourier Transform (DFT), Discrete Wavelet Transform (DWT), Karhunen-Loeve (KL) transform or Singular Value Decomposition (SVD) can be applied. While the use of DFT and KL transform or SVD has been studied in the literature, to our knowledge, there is no in-depth study on the application of DWT. In this paper, we propose to use Haar Wavelet Transform for time series indexing. The major contributions are: (1) we show that Euclidean distance is preserved in the Haar transformed domain and no false dismissal will occur, (2) we show that Haar transform can outperform DFT through experiments, (3) a new similarity model is suggested to accommodate vertical shift of time series, and (4) a two-phase method is proposed for efficient nearest neighbor query in time series databases.
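Contribution (1) can be illustrated with a small sketch. It assumes the orthonormal normalization of the Haar transform (pairwise sums and differences divided by the square root of 2) and series whose length is a power of two; the paper's own normalization may differ. The function names are hypothetical. Because this transform is orthonormal, the Euclidean distance between two series equals the distance between their coefficient vectors, and the distance computed on any prefix of the coefficients can only underestimate the true distance, which is why filtering on a reduced feature vector causes no false dismissals.

```python
import math


def haar_transform(series):
    """Orthonormal Haar wavelet transform (length must be a power of two).

    Output layout: overall average term first, then detail coefficients
    from the coarsest level to the finest.
    """
    out = [float(v) for v in series]
    n = len(out)
    if n == 0 or n & (n - 1):
        raise ValueError("length must be a power of two")

    root2 = math.sqrt(2.0)
    while n > 1:
        half = n // 2
        averages = [(out[2 * i] + out[2 * i + 1]) / root2 for i in range(half)]
        details = [(out[2 * i] - out[2 * i + 1]) / root2 for i in range(half)]
        out[:n] = averages + details  # finer details stay in place behind
        n = half
    return out


def euclidean(a, b):
    """Euclidean distance between two equal-length sequences."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))
```

In an index, only the first few coefficients of each series would be stored; candidates whose prefix distance already exceeds the query radius can be discarded safely, and the rest are checked against the full series.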
Robust object recognition with cortex-like mechanisms
 IEEE Trans. Pattern Analysis and Machine Intelligence
, 2007
Cited by 202 (36 self)
Abstract. We introduce a new general framework for the recognition of complex visual scenes, which is motivated by biology: We describe a hierarchical system that closely follows the organization of visual cortex and builds an increasingly complex and invariant feature representation by alternating between a template matching and a maximum pooling operation. We demonstrate the strength of the approach on a range of recognition tasks: From invariant single object recognition in clutter to multiclass categorization problems and complex scene understanding tasks that rely on the recognition of both shape-based as well as texture-based objects. Given the biological constraints that the system had to satisfy, the approach performs surprisingly well: It has the capability of learning from only a few training examples and competes with state-of-the-art systems. We also discuss the existence of a universal, redundant dictionary of features that could handle the recognition of most object categories. In addition to its relevance for computer vision, the success of this approach suggests a plausibility proof for a class of feedforward models of object recognition in cortex.