Results 1  10
of
352
A database of human segmented natural images and its application to evaluating segmentation algorithms and measuring ecological statistics
 in Proc. 8th Int’l Conf. Computer Vision
, 2001
"... This paper presents a database containing ‘ground truth ’ segmentations produced by humans for images of a wide variety of natural scenes. We define an error measure which quantifies the consistency between segmentations of differing granularities and find that different human segmentations of the s ..."
Abstract

Cited by 953 (14 self)
 Add to MetaCart
(Show Context)
This paper presents a database containing ‘ground truth ’ segmentations produced by humans for images of a wide variety of natural scenes. We define an error measure which quantifies the consistency between segmentations of differing granularities and find that different human segmentations of the same image are highly consistent. Use of this dataset is demonstrated in two applications: (1) evaluating the performance of segmentation algorithms and (2) measuring probability distributions associated with Gestalt grouping factors as well as statistics of image region properties. 1.
Fast and robust fixedpoint algorithms for independent component analysis
 IEEE TRANS. NEURAL NETW
, 1999
"... Independent component analysis (ICA) is a statistical method for transforming an observed multidimensional random vector into components that are statistically as independent from each other as possible. In this paper, we use a combination of two different approaches for linear ICA: Comon’s informat ..."
Abstract

Cited by 885 (34 self)
 Add to MetaCart
(Show Context)
Independent component analysis (ICA) is a statistical method for transforming an observed multidimensional random vector into components that are statistically as independent from each other as possible. In this paper, we use a combination of two different approaches for linear ICA: Comon’s informationtheoretic approach and the projection pursuit approach. Using maximum entropy approximations of differential entropy, we introduce a family of new contrast (objective) functions for ICA. These contrast functions enable both the estimation of the whole decomposition by minimizing mutual information, and estimation of individual independent components as projection pursuit directions. The statistical properties of the estimators based on such contrast functions are analyzed under the assumption of the linear mixture model, and it is shown how to choose contrast functions that are robust and/or of minimum variance. Finally, we introduce simple fixedpoint algorithms for practical optimization of the contrast functions. These algorithms optimize the contrast functions very fast and reliably.
Efficient sparse coding algorithms
 In NIPS
, 2007
"... Sparse coding provides a class of algorithms for finding succinct representations of stimuli; given only unlabeled input data, it discovers basis functions that capture higherlevel features in the data. However, finding sparse codes remains a very difficult computational problem. In this paper, we ..."
Abstract

Cited by 445 (14 self)
 Add to MetaCart
(Show Context)
Sparse coding provides a class of algorithms for finding succinct representations of stimuli; given only unlabeled input data, it discovers basis functions that capture higherlevel features in the data. However, finding sparse codes remains a very difficult computational problem. In this paper, we present efficient sparse coding algorithms that are based on iteratively solving two convex optimization problems: an L1regularized least squares problem and an L2constrained least squares problem. We propose novel algorithms to solve both of these optimization problems. Our algorithms result in a significant speedup for sparse coding, allowing us to learn larger sparse codes than possible with previously described algorithms. We apply these algorithms to natural images and demonstrate that the inferred sparse codes exhibit endstopping and nonclassical receptive field surround suppression and, therefore, may provide a partial explanation for these two phenomena in V1 neurons. 1
Emergence of Phase and ShiftInvariant Features by Decomposition of Natural Images into Independent Feature Subspaces
, 2000
"... this article, we show that the same principle of independence maximization can explain the emergence of phase and shiftinvariant features, similar to those found in complex cells. This new kind of emergence is obtained by maximizing the independence between norms of projections on linear subspaces ..."
Abstract

Cited by 203 (31 self)
 Add to MetaCart
this article, we show that the same principle of independence maximization can explain the emergence of phase and shiftinvariant features, similar to those found in complex cells. This new kind of emergence is obtained by maximizing the independence between norms of projections on linear subspaces (instead of the independence of simple linear filter outputs). Thenorms of the projections on such "independent feature subspaces" then indicate the values of invariant features
Sparse deep belief net model for visual area V2
 Advances in Neural Information Processing Systems 20
, 2008
"... Abstract 1 Motivated in part by the hierarchical organization of the neocortex, a number of recently proposed algorithms have tried to learn hierarchical, or “deep, ” structure from unlabeled data. While several authors have formally or informally compared their algorithms to computations performed ..."
Abstract

Cited by 162 (19 self)
 Add to MetaCart
Abstract 1 Motivated in part by the hierarchical organization of the neocortex, a number of recently proposed algorithms have tried to learn hierarchical, or “deep, ” structure from unlabeled data. While several authors have formally or informally compared their algorithms to computations performed in visual area V1 (and the cochlea), little attempt has been made thus far to evaluate these algorithms in terms of their fidelity for mimicking computations at deeper levels in the cortical hierarchy. This thesis describes an unsupervised learning model that faithfully mimics certain properties of visual area V2. Specifically, we develop a sparse variant of the deep belief networks described by Hinton et al. (2006). We learn two layers of representation in the network, and demonstrate that the first layer, similar to prior work on sparse coding and ICA, results in localized, oriented, edge filters, similar to the gabor functions known to model simple cell receptive fields in area V1. Further, the second layer in our model encodes various combinations of the first layer responses in the data. Specifically, it picks up both collinear (“contour”) features as well as corners and junctions. More interestingly, in a quantitative comparison, the encoding of these more complex “corner ” features matches well with the results from Ito & Komatsu’s study of neural responses to angular stimuli in area V2 of the macaque. This suggests that our sparse variant of deep belief networks holds promise for modeling more higherorder features that are encoded in visual cortex. Conversely, one may also interpret the results reported here as suggestive that visual area V2 is performing computations on its input similar to those performed in (sparse) deep belief networks. This plausible relationship generates some intriguing hypotheses about V2 computations. 1 This thesis is an extended version of an earlier paper by Honglak Lee, Chaitanya Ekanadham, and Andrew Ng titled “Sparse deep belief net model for visual area V2.” 1
Probabilistic framework for the adaptation and comparison of image codes
 J. OPT. SOC. AM. A
, 1999
"... We apply a Bayesian method for inferring an optimal basis to the problem of finding efficient image codes for natural scenes. The basis functions learned by the algorithm are oriented and localized in both space and frequency, bearing a resemblance to twodimensional Gabor functions, and increasing ..."
Abstract

Cited by 141 (10 self)
 Add to MetaCart
(Show Context)
We apply a Bayesian method for inferring an optimal basis to the problem of finding efficient image codes for natural scenes. The basis functions learned by the algorithm are oriented and localized in both space and frequency, bearing a resemblance to twodimensional Gabor functions, and increasing the number of basis functions results in a greater sampling density in position, orientation, and scale. These properties also resemble the spatial receptive fields of neurons in the primary visual cortex of mammals, suggesting that the receptivefield structure of these neurons can be accounted for by a general efficient coding principle. The probabilistic framework provides a method for comparing the coding efficiency of different bases objectively by calculating their probability given the observed data or by measuring the entropy of the basis function coefficients. The learned bases are shown to have better coding efficiency than traditional Fourier and wavelet bases. This framework also provides a Bayesian solution to the problems of image denoising and filling in of missing pixels. We demonstrate that the results obtained by applying the learned bases to these problems are improved over those obtained with traditional techniques.
Topology and Data
, 2008
"... An important feature of modern science and engineering is that data of various kinds is being produced at an unprecedented rate. This is so in part because of new experimental methods, and in part because of the increase in the availability of high powered computing technology. It is also clear that ..."
Abstract

Cited by 119 (4 self)
 Add to MetaCart
(Show Context)
An important feature of modern science and engineering is that data of various kinds is being produced at an unprecedented rate. This is so in part because of new experimental methods, and in part because of the increase in the availability of high powered computing technology. It is also clear that the nature of the data
The nonlinear statistics of highcontrast patches in natural images
 International Journal of Computer Vision
"... (Article begins on next page) The Harvard community has made this article openly available. Please share how this access benefits you. Your story matters. ..."
Abstract

Cited by 116 (3 self)
 Add to MetaCart
(Article begins on next page) The Harvard community has made this article openly available. Please share how this access benefits you. Your story matters.
Components of bottomup gaze allocation in natural images
, 2005
"... ... showed that a model of bottomup visual attention can account in part for the spatial locations fixated by humans while freeviewing complex natural and artificial scenes. That study used a definition of salience based on local detectors with coarse global surround inhibition. Here, we use a sim ..."
Abstract

Cited by 110 (18 self)
 Add to MetaCart
... showed that a model of bottomup visual attention can account in part for the spatial locations fixated by humans while freeviewing complex natural and artificial scenes. That study used a definition of salience based on local detectors with coarse global surround inhibition. Here, we use a similar framework to investigate the roles of several types of nonlinear interactions known to exist in visual cortex, and of eccentricitydependent processing. For each of these, we added a component to the salience model, including richer interactions among orientationtuned units, both at spatial short range (for clutter reduction) and long range (for contour facilitation), and a detailed model of eccentricitydependent changes in visual processing. Subjects freeviewed naturalistic and artificial images while their eye movements were recorded, and the resulting fixation locations were compared with the modelsÕ predicted salience maps. We found that the proposed interactions indeed play a significant role in the spatiotemporal deployment of attention in natural scenes; about half of the observed intersubject variance can be explained by these different models. This suggests that attentional guidance does not depend solely on local visual features, but must also include the effects of interactions among features. As models of these interactions become more accurate in predicting behaviorallyrelevant salient locations, they become useful to a range of applications in computer vision and humanmachine interface design.
A TwoLayer Sparse Coding Model Learns Simple and Complex Cell Receptive Fields and Topography From Natural Images
 VISION RESEARCH
, 2001
"... The classical receptive fields of simple cells in the visual cortex have been shown to emerge from the statistical properties of natural images by forcing the cell responses to be maximally sparse, i.e. significantly activated only rarely. Here, we show that this single principle of sparseness can ..."
Abstract

Cited by 109 (16 self)
 Add to MetaCart
The classical receptive fields of simple cells in the visual cortex have been shown to emerge from the statistical properties of natural images by forcing the cell responses to be maximally sparse, i.e. significantly activated only rarely. Here, we show that this single principle of sparseness can also lead to emergence of topography (columnar organization) and complex cell properties as well. These are obtained by maximizing the sparsenesses of locally pooled energies, which correspond to complex cell outputs. Thus we obtain a highly parsimonious model of how these properties of the visual cortex are adapted to the characteristics of the natural input.