Learning to detect natural image boundaries using local brightness, color, and texture cues
- PAMI, 2004
Cited by 266 (16 self)
The goal of this work is to accurately detect and localize boundaries in natural scenes using local image measurements. We formulate features that respond to characteristic changes in brightness, color, and texture associated with natural boundaries. In order to combine the information from these features in an optimal way, we train a classifier using human-labeled images as ground truth. The output of this classifier provides the posterior probability of a boundary at each image location and orientation. We present precision-recall curves showing that the resulting detector significantly outperforms existing approaches. Our two main results are 1) that cue combination can be performed adequately with a simple linear model and 2) that a proper, explicit treatment of texture is required to detect boundaries in natural images. Index Terms: texture, supervised learning, cue combination, natural images, ground truth segmentation data set, boundary detection, boundary localization.
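As a rough illustration of the idea in this abstract (combining brightness, color, and texture cues with a simple linear model that outputs a posterior boundary probability), here is a minimal sketch. It assumes logistic regression as the linear model and uses synthetic cue values in place of real image features; the function names `fit_linear_cue_combination` and `boundary_posterior` and all data are hypothetical and are not taken from the paper.

```python
import math
import random


def sigmoid(z):
    """Numerically stable logistic function."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def fit_linear_cue_combination(features, labels, lr=0.5, epochs=500):
    """Fit logistic-regression weights by batch gradient descent.

    features: list of [brightness, color, texture] cue responses (assumed).
    labels:   1 if a human marked a boundary at that location, else 0.
    """
    n_cues = len(features[0])
    n = len(features)
    weights = [0.0] * n_cues
    bias = 0.0
    for _ in range(epochs):
        grad_w = [0.0] * n_cues
        grad_b = 0.0
        for x, y in zip(features, labels):
            p = sigmoid(bias + sum(w * xi for w, xi in zip(weights, x)))
            err = p - y  # gradient of the log loss w.r.t. the linear score
            for j in range(n_cues):
                grad_w[j] += err * x[j]
            grad_b += err
        weights = [w - lr * g / n for w, g in zip(weights, grad_w)]
        bias -= lr * grad_b / n
    return weights, bias


def boundary_posterior(x, weights, bias):
    """Posterior probability of a boundary given one cue vector."""
    return sigmoid(bias + sum(w * xi for w, xi in zip(weights, x)))


# Synthetic training set (an assumption): boundary locations tend to
# produce larger cue responses than non-boundary locations.
rng = random.Random(0)
features, labels = [], []
for _ in range(200):
    y = rng.randint(0, 1)
    features.append([rng.gauss(0.7 if y else 0.3, 0.15) for _ in range(3)])
    labels.append(y)

weights, bias = fit_linear_cue_combination(features, labels)
p_strong = boundary_posterior([0.8, 0.8, 0.8], weights, bias)
p_weak = boundary_posterior([0.2, 0.2, 0.2], weights, bias)
```

With this toy data, strong responses on all three cues should map to a posterior above 0.5 and weak responses to a posterior below 0.5. A real detector would replace the synthetic vectors with gradient features computed from images and human-labeled ground truth.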
Fragment-based image completion
- ACM TRANS. ON GRAPHICS. SPECIAL ISSUE: PROC. OF ACM SIGGRAPH, 2003
Cited by 62 (3 self)
We present a new method for completing missing parts caused by the removal of foreground or background elements from an image. Our goal is to synthesize a complete, visually plausible and coherent image. The visible parts of the image serve as a training set to infer the unknown parts. Our method iteratively approximates the unknown regions and composites adaptive image fragments into the image. Values of an inverse matte are used to compute a confidence map and a level set that direct an incremental traversal within the unknown area from high to low confidence. In each step, guided by a fast smooth approximation, an image fragment is selected from the most similar and frequent examples. As the selected fragments are composited, their likelihood increases along with the mean confidence of the image, until reaching a complete image. We demonstrate our method by completion of photographs and paintings.
Depth estimation from image structure
- IEEE Transactions on Pattern Analysis and Machine Intelligence, 2002
Cited by 49 (9 self)
In the absence of cues for absolute depth measurements such as binocular disparity, motion, or defocus, the absolute distance between the observer and a scene cannot be measured. The interpretation of shading, edges, and junctions may provide a 3D model of the scene but it will not provide information about the actual "scale" of the space. One possible source of information for absolute depth estimation is the image size of known objects. However, object recognition, under unconstrained conditions, remains difficult and unreliable for current computational approaches. Here, we propose a source of information for absolute depth estimation based on the whole scene structure that does not rely on specific objects. We demonstrate that, by recognizing the properties of the structures present in the image, we can infer the scale of the scene and, therefore, its absolute mean depth. We illustrate the interest in computing the mean depth of the scene with application to scene recognition and object detection. Index Terms: depth, image statistics, scene structure, scene recognition, scale selection, monocular vision.
Learning affinity functions for image segmentation: combining patch-based and gradient-based approaches
- In Proc. IEEE Conf. Comput. Vision and Pattern Recognition, 2003
Cited by 41 (4 self)
This paper studies the problem of combining region and boundary cues for natural image segmentation. We employ a large database of manually segmented images in order to learn an optimal affinity function between pairs of pixels. These pairwise affinities can then be used to cluster the pixels into visually coherent groups. Region cues are computed as the similarity in brightness, color, and texture between image patches. Boundary cues are incorporated by looking for the presence of an “intervening contour”, a large gradient along a straight line connecting two pixels. We first use the dataset of human segmentations to individually optimize parameters of the patch and gradient features for brightness, color, and texture cues. We then quantitatively measure the power of different feature combinations by computing the precision and recall of classifiers trained using those features. The mutual information between the output of the classifiers and the same-segment indicator function provides an alternative evaluation technique that yields identical conclusions. As expected, the best classifier makes use of brightness, color, and texture features, in both patch and gradient forms. We find that for brightness, the gradient cue outperforms the patch similarity. In contrast, using color patch similarity yields better results than using color gradients. Texture is the most powerful of the three channels, with both patches and gradients carrying significant independent information. Interestingly, the proximity of the two pixels does not add any information beyond that provided by the similarity cues. We also find that the convexity assumptions made by the intervening contour approach are supported by the ecological statistics of the dataset.
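The "intervening contour" cue described in this abstract can be sketched as follows: take the strongest boundary response along the straight line between two pixels and turn it into a pairwise affinity. This is only an illustrative sketch. The exponential mapping, the `sigma` parameter, the line-sampling scheme, and the names `line_pixels` and `intervening_contour_affinity` are assumptions made here, not details given in the abstract, which describes a learned classifier rather than a fixed formula.

```python
import math


def line_pixels(p, q):
    """Integer pixels sampled along the straight segment from p to q (inclusive)."""
    (r0, c0), (r1, c1) = p, q
    steps = max(abs(r1 - r0), abs(c1 - c0))
    if steps == 0:
        return [p]
    return [
        (round(r0 + (r1 - r0) * t / steps), round(c0 + (c1 - c0) * t / steps))
        for t in range(steps + 1)
    ]


def intervening_contour_affinity(boundary, p, q, sigma=0.1):
    """Affinity in (0, 1]: close to 1 when no strong boundary separates p and q.

    boundary: 2D list of boundary strengths in [0, 1] (for example, gradient
              magnitude), indexed as boundary[row][col].
    """
    strongest = max(boundary[r][c] for r, c in line_pixels(p, q))
    return math.exp(-strongest / sigma)  # assumed mapping, not from the paper


# Toy 5x5 boundary map with a vertical contour in column 2.
boundary = [[0.0, 0.0, 0.9, 0.0, 0.0] for _ in range(5)]

same_side = intervening_contour_affinity(boundary, (0, 0), (4, 1))
across = intervening_contour_affinity(boundary, (2, 0), (2, 4))
```

In the toy map, the two pixels on the same side of the contour get the maximum affinity, while the pair whose connecting line crosses column 2 gets an affinity near zero. Pairwise affinities of this kind are what a clustering step would then consume to group pixels.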
Issues in the convergence of control with communication and computation
- Proceedings of the 43rd IEEE Conference on Decision and Control, 2004
Cited by 22 (6 self)
Any opinions, findings, conclusions, or recommendations expressed in this publication are those of the author and do not necessarily reflect the views of the United States
Neural events and perceptual awareness
- COGNITION, 2001
Cited by 21 (0 self)
Neural correlates of perceptual awareness, until very recently an elusive quarry, are now almost commonplace findings. This article first describes a variety of neural correlates of perceptual awareness based on fMRI, ERPs, and single-unit recordings. It is then argued that our quest should ultimately focus not on mere correlates of awareness, but rather on the neural events that are both necessary and sufficient for perceptual awareness. Indeed, preliminary evidence suggests that although many of the neural correlates already reported may be necessary for the corresponding state of awareness, it is unlikely that they are sufficient for it. The final section considers three hypotheses concerning the possible sufficiency conditions
Locally Bayesian Learning with Applications to Retrospective Revaluation and Highlighting
- Psychological Review, 2006
Cited by 16 (0 self)
A scheme is described for locally Bayesian parameter updating in models structured as successions of component functions. The essential idea is to back-propagate the target data to interior modules, such that an interior component’s target is the input to the next component that maximizes the probability of the next component’s target. Each layer then does locally Bayesian learning. The approach assumes online trial-by-trial learning. The resulting parameter updating is not globally Bayesian but can better capture human behavior. The approach is implemented for an associative learning model that first maps inputs to attentionally filtered inputs and then maps attentionally filtered inputs to outputs. The Bayesian updating allows the associative model to exhibit retrospective revaluation effects such as backward blocking and unovershadowing, which have been challenging for associative learning models. The back-propagation of target values to attention allows the model to show trial-order effects, including highlighting and differences in magnitude of forward and backward blocking, which have been challenging for Bayesian learning models.
Model order selection and cue combination for image segmentation
- In CVPR, 2006
Cited by 7 (0 self)
Model order selection and cue combination are both difficult open problems in the area of clustering. In this work we build upon stability-based approaches to develop a new method for automatic model order selection and cue combination with applications to visual grouping. Novel features of our approach include the ability to detect multiple stable clusterings (instead of only one), a simpler means of calculating stability that does not require training a classifier, and a new characterization of the space of stabilities for a continuum of segmentations that provides for an efficient sampling scheme. Our contribution is a framework for visual grouping that frees the user from the hassles of parameter tuning and model order selection: the input is an image, the output is a shortlist of segmentations.
Scene analysis by integrating primitive segmentation and associative memory
- IEEE Transactions on Systems, Man, and Cybernetics Part B, 2002
Cited by 7 (2 self)
Scene analysis is a major aspect of perception and continues to challenge machine perception. This paper addresses the scene-analysis problem by integrating a primitive segmentation stage with a model of associative memory. Our model is a multistage system that consists of an initial primitive segmentation stage, a multimodule associative memory, and a short-term memory (STM) layer. Primitive segmentation is performed by a locally excitatory globally inhibitory oscillator network (LEGION), which segments the input scene into multiple parts that correspond to groups of synchronous oscillations. Each segment triggers memory recall and multiple recalled patterns then interact with one another in the STM layer. The STM layer projects to the LEGION network, giving rise to memory-based grouping and segmentation. The system achieves scene analysis entirely in phase space, which provides a unifying mechanism for both bottom-up analysis and top-down analysis. The model is evaluated with a systematic set of three-dimensional (3-D) line drawing objects, which are arranged in an arbitrary fashion to compose input scenes that allow object occlusion. Memory-based organization is responsible for a significant improvement in performance. A number of issues are discussed, including input-anchored alignment, top-down organization, and the role of STM in producing context sensitivity of memory recall. Index Terms: associative memory, grouping, integration, locally excitatory globally inhibitory oscillator network (LEGION), scene analysis, segmentation, short-term memory (STM).
Efficient Visual Search without Top-down or Bottom-up Guidance: A Putative Role for Perceptual Organization
- 2001
Cited by 5 (2 self)
Two types of mechanisms have dominated theoretical accounts of efficient visual search. First are bottom-up processes related to the hypothesized characteristics of retinotopic feature maps. Second are top-down mechanisms related to feature selection. Little effort has been made to understand visual search in terms of perceptual grouping despite its acknowledged importance in general visual perception. To examine the possible role of perceptual grouping we employ a new search paradigm whereby a target is defined only in a context-dependent manner by multiple conjunctions of feature dimensions. Because targets in a multi-conjunction task cannot be distinguished from distractors either by bottom-up guidance or top-down guidance, current theories of visual search predict inefficient search. While inefficient search does occur for the multiple conjunctions of orientation with color or luminance, we found efficient search for multiple conjunctions of luminance with size, shape, and topological properties. We also show that repeated presentations of either targets or a set of distractors result in much faster performance. Our results suggest that perceptual organization can play a decisive role in visual search, and theories of visual attention need to take this into account. Furthermore, multi-conjunction search may provide a new vehicle for investigating perceptual grouping and scene analysis.

