Results 1 - 10 of 18
Embedding Gestalt Laws in Markov Random Fields - a theory for shape modeling and perceptual organization
, 1999
Cited by 66 (9 self)
Abstract
The goal of this paper is to study a mathematical framework of 2D object shape modeling and learning for middle-level vision problems, such as image segmentation and perceptual organization. For this purpose, we pursue generic shape models which characterize the most common features of 2D object shapes. In this paper, shape models are learned from observed natural shapes based on a minimax entropy learning theory (Zhu and Mumford 1997; Zhu, Wu and Mumford 1997) [31, 32]. The learned shape models are Gibbs distributions defined on Markov random fields (MRFs). The neighborhood structures of these MRFs correspond to Gestalt laws: colinearity, cocircularity, proximity, parallelism, and symmetry. Thus both contour-based and region-based features are accounted for. Stochastic Markov chain Monte Carlo (MCMC) algorithms are proposed for learning and model verification. Furthermore, this paper provides a quantitative measure for the so-called non-accidental statistics, and thus justifies some empi...
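As a rough illustration of the sampling idea in this abstract, the sketch below runs a Metropolis chain on a toy Gibbs distribution over contour-element orientations. The energy function, names, and parameters here are hypothetical stand-ins, not the paper's learned minimax-entropy model; the clique potential merely mimics a colinearity preference.

```python
import math
import random

def energy(thetas, beta=2.0):
    """Toy Gibbs energy: a colinearity-style clique potential that
    penalises orientation changes between neighbouring contour elements
    on a closed contour (hypothetical, not the paper's learned model)."""
    n = len(thetas)
    return beta * sum((thetas[i] - thetas[(i + 1) % n]) ** 2 for i in range(n))

def metropolis(thetas, steps=2000, sigma=0.3, seed=0):
    """Metropolis sampler for p(thetas) proportional to exp(-energy(thetas))."""
    rng = random.Random(seed)
    cur = list(thetas)
    e = energy(cur)
    for _ in range(steps):
        i = rng.randrange(len(cur))           # pick one site of the MRF
        prop = list(cur)
        prop[i] += rng.gauss(0.0, sigma)      # local Gaussian proposal
        e_prop = energy(prop)
        # Accept if energy drops, or with Boltzmann probability otherwise.
        if e_prop <= e or rng.random() < math.exp(e - e_prop):
            cur, e = prop, e_prop
    return cur, e

start = [1.0, -1.0, 1.0, -1.0, 1.0, -1.0]     # a very "wiggly" contour
sample, e_final = metropolis(start)
# The chain drifts toward smoother (lower-energy) configurations.
```

The same accept/reject loop, with learned potentials in place of the toy one, is the basic engine behind MCMC-based learning and model verification.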
Statistical Approaches to Feature-Based Object Recognition
, 1997
Cited by 61 (1 self)
Abstract
This paper examines statistical approaches to model-based object recognition. Evidence is presented indicating that, in some domains, normal (Gaussian) distributions are more accurate than uniform distributions for modeling feature fluctuations. This motivates the development of new maximum-likelihood and MAP recognition formulations which are based on normal feature models. These formulations lead to an expression for the posterior probability of the pose and correspondences given an image. Several avenues are explored for specifying a recognition hypothesis. In the first approach, correspondences are included as a part of the hypotheses. Search for solutions may be ordered as a combinatorial search in correspondence space, or as a search over pose space, where the same criterion can equivalently be viewed as a robust variant of chamfer matching. In the second approach, correspondences are not viewed as being a part of the hypotheses. This leads to a criterion that is a smooth funct...
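A minimal sketch of the Gaussian-feature-model idea, assuming isotropic noise with a hypothetical standard deviation `sigma`; this is a toy stand-in for the paper's maximum-likelihood criterion, not its actual formulation:

```python
import math

def gaussian_log_score(model_pts, image_pts, sigma=1.0):
    """Log-likelihood of matched feature residuals under an isotropic
    Gaussian noise model.  Higher scores mean the hypothesised pose
    places model features closer to their matched image features."""
    score = 0.0
    for (mx, my), (ix, iy) in zip(model_pts, image_pts):
        r2 = (mx - ix) ** 2 + (my - iy) ** 2
        score += -r2 / (2 * sigma ** 2) - math.log(2 * math.pi * sigma ** 2)
    return score

model = [(0.0, 0.0), (1.0, 0.0), (0.0, 1.0)]
good = [(0.1, 0.0), (1.0, 0.1), (-0.1, 1.0)]   # small residuals
bad = [(2.0, 2.0), (3.0, 2.0), (2.0, 3.0)]     # large residuals
# A close match scores higher than a distant one.
```

Under a uniform (bounded-error) model every residual inside the bound scores equally; the Gaussian model instead rewards smaller residuals continuously, which is what makes the pose-space criterion behave like a robust chamfer match.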
Early completion of occluded objects
 Vision Research
, 1998
Cited by 35 (10 self)
Abstract
We show that early vision can use monocular cues to rapidly complete partially-occluded objects. Visual search for easily-detected fragments becomes difficult when the completed shape is similar to others in the display; conversely, search for fragments that are difficult to detect becomes easy when the completed shape is distinctive. Results indicate that completion occurs via the occlusion-triggered removal of occlusion edges and linking of associated regions. We fail to find evidence for a visible filling-in of contours or surfaces, but do find evidence for a "functional" filling-in that prevents the constituent fragments from being rapidly accessed. As such, it is only the completed structures, and not the fragments themselves, that serve as the basis for rapid recognition.
Recognition Using Region Correspondences
 International Journal of Computer Vision
, 1995
Cited by 34 (7 self)
Abstract
A central problem in object recognition is to determine the transformation that relates the model to the image, given some partial correspondence between the two. This is useful in determining whether an object is present in an image, and if so, determining where the object is. We present a novel method of solving this problem that uses region information. In our approach the model is divided into volumes, and the image is divided into regions. Given a match between subsets of volumes and regions (without any explicit correspondence between different pieces of the regions) the alignment transformation is computed. The method applies to planar objects under similarity, affine, and projective transformations and to projections of 3D objects undergoing affine and projective transformations.
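One concrete instance of the alignment computation is recovering a planar affine map from three matched points (for example, centroids of matched regions, used here as a hypothetical stand-in for the paper's region-based constraints). Three non-collinear correspondences determine the six affine parameters exactly:

```python
def affine_from_three(src, dst):
    """Recover the 2-D affine map (x, y) -> (a*x + b*y + c) per output
    coordinate, taking three source points to three destination points.
    Solved exactly with Cramer's rule; raises if the points are collinear."""
    (x1, y1), (x2, y2), (x3, y3) = src
    det = x1 * (y2 - y3) - y1 * (x2 - x3) + (x2 * y3 - x3 * y2)
    if abs(det) < 1e-12:
        raise ValueError("source points are collinear")

    def solve(t1, t2, t3):
        # Cramer's rule for a*x_i + b*y_i + c = t_i, i = 1..3.
        a = (t1 * (y2 - y3) - y1 * (t2 - t3) + (t2 * y3 - t3 * y2)) / det
        b = (x1 * (t2 - t3) - t1 * (x2 - x3) + (x2 * t3 - x3 * t2)) / det
        c = (x1 * (y2 * t3 - y3 * t2) - y1 * (x2 * t3 - x3 * t2)
             + t1 * (x2 * y3 - x3 * y2)) / det
        return a, b, c

    (u1, v1), (u2, v2), (u3, v3) = dst
    return solve(u1, u2, u3), solve(v1, v2, v3)

# A pure translation by (2, 3):
row_u, row_v = affine_from_three([(0, 0), (1, 0), (0, 1)],
                                 [(2, 3), (3, 3), (2, 4)])
```

With more than three matches the analogous computation becomes a least-squares fit; similarity and projective cases follow the same pattern with different parameterizations.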
Extracting Salient Curves from Images: An Analysis of the Saliency Network
, 1998
Cited by 28 (3 self)
Abstract
The Saliency Network proposed by Shashua and Ullman (1988) is a well-known approach to the problem of extracting salient curves from images while performing gap completion. This paper analyzes the Saliency Network. The Saliency Network is attractive for several reasons. First, the network generally prefers long and smooth curves over short or wiggly ones. While computing saliencies, the network also fills in gaps with smooth completions and tolerates noise. Finally, the network is locally connected, and its size is proportional to the size of the image. Nevertheless, our analysis reveals certain weaknesses with the method. In particular, we show cases in which the most salient element does not lie on the perceptually most salient curve. Furthermore, in some cases the saliency measure changes its preferences when curves are scaled uniformly. Also, we show that for certain fragmented curves the measure prefers large gaps over a few small gaps of the same total size. In addition, we analyze the time complexity required by the method. We show that the number of steps required for convergence in serial implementations is quadratic in the size of the network, and in parallel implementations is linear in the size of the network. We discuss problems due to coarse sampling of the range of possible orientations. Finally, we consider the possibility of using the Saliency Network for grouping. We show that the Saliency Network recovers the most salient curve efficiently, but it has problems with identifying any salient curve other than the most salient one.
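The flavor of the saliency recurrence can be sketched in one dimension. This toy version (the attenuation factor `rho` and the chain topology are simplifying assumptions, not the network's actual 2-D orientation lattice) shows why a long run of active elements accumulates more saliency than an isolated element of the same local strength:

```python
def saliency_iterate(sigma, rho=0.7, steps=50):
    """1-D toy of the Shashua-Ullman saliency recurrence: each element's
    saliency is its own response plus an attenuated contribution from its
    best neighbour.  The network is locally connected, so one synchronous
    update costs O(n)."""
    n = len(sigma)
    s = list(sigma)
    for _ in range(steps):
        s = [sigma[i] + rho * max(s[j] for j in (i - 1, i + 1) if 0 <= j < n)
             for i in range(n)]
    return s

# Elements 1..5 form a long "curve"; element 7 is isolated but equally strong.
resp = [0, 1, 1, 1, 1, 1, 0, 1, 0]
sal = saliency_iterate(resp)
# Support propagates along the run, so its elements end up more salient
# than the isolated element.
```

Because each synchronous step only propagates information one neighbor further, convergence in a serial implementation takes on the order of the network size in sweeps, which is the source of the quadratic serial cost discussed in the paper.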
Uncertainty Propagation in Model-Based Recognition
 International Journal of Computer Vision
, 1994
Cited by 7 (2 self)
Abstract
Building robust recognition systems requires a careful understanding of the effects of error in sensed features. In model-based recognition, matches between model features and sensed image features typically are used to compute a model pose and then project the unmatched model features into the image. The error in the image features results in uncertainty in the projected model features. We first show how error propagates when poses are based on three pairs of model and image points. In particular, we show how to simply and efficiently compute the region in the image where an unmatched model point might appear, for both Gaussian and bounded error in the detection of image points, and for both scaled-orthographic and perspective projection models. This result applies to objects that are fully three-dimensional, where past results considered only two-dimensional objects. The result is based on an approximation that accurately linearizes the relationship between matched image points and unmatched, projected model points. Secondly, based on the linear approximation, we show how we can utilize linear programming to compute the propagated error region for any number of initial matches. Finally, we use these results to extend, from two-dimensional to three-dimensional objects, robust implementations of alignment, interpretation-tree search, and transformation clustering.
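For the Gaussian case, the standard first-order rule behind such linearized propagation is Cov(y) = J Sigma J^T, where J is the Jacobian of the (linearized) mapping. A minimal numeric sketch, with a hypothetical 2x2 Jacobian standing in for the paper's point-to-projection relationship:

```python
def matmul(A, B):
    """Plain matrix product of two nested-list matrices."""
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*B)]
            for row in A]

def propagate(J, Sigma):
    """First-order propagation of Gaussian uncertainty: if x ~ N(mu, Sigma)
    and y is approximately J x near mu, then Cov(y) = J Sigma J^T."""
    Jt = [list(row) for row in zip(*J)]
    return matmul(matmul(J, Sigma), Jt)

# Hypothetical example: the map stretches the x-axis by 2, so the
# x-variance is amplified by 4 while the y-variance is unchanged.
J = [[2.0, 0.0], [0.0, 1.0]]
Sigma = [[0.25, 0.0], [0.0, 0.25]]
Cov = propagate(J, Sigma)
```

The propagated covariance defines an uncertainty ellipse for each projected model point; the bounded-error case instead yields a polytope, which is where the paper's linear-programming formulation enters.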
Indexing Based on Algebraic Functions of Views
 COMPUTER VISION AND IMAGE UNDERSTANDING
, 1998
Cited by 6 (5 self)
Abstract
this paper, we propose the use of algebraic functions of views for indexing-based object recognition. During indexing, we consider groups of model points and we represent all the views (i.e., images) that they can produce in a hash table. The images that a group of model points can produce are computed by combining a small number of reference views which contain the group using algebraic functions of views. Fundamental to this procedure is a methodology, based on Singular Value Decomposition and Interval Arithmetic, for estimating the allowable ranges of values that the parameters of algebraic functions can assume. During recognition, scene groups are used to retrieve from the hash table the most feasible model groups that might have produced the scene groups. The use of algebraic functions of views for indexing-based recognition offers a number of advantages. First of all, the hash table can be built using a small number of reference views per object. This is in contrast to current approaches which build the hash table using either a large number of reference views or 3D models. Most importantly, recognition does not rely on the similarity between reference views and novel views; all that is required for the novel views is to contain common groups of points with a small number of reference views. Second, verification becomes simpler. This is because candidate models can now be back-projected onto the scene by applying a linear transformation on a small number of reference views of the candidate model. Finally, the proposed approach is more general and extendible. This is because algebraic functions of views have been shown to exist over a wide range of transformations and projections. The recognition performance of the proposed approach is demonstrated using both artific...
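The hash-table mechanics can be sketched with a much cruder relative, classical geometric hashing: quantized affine-invariant coordinates stand in for the paper's ranges of algebraic-function parameters (which are estimated with SVD and interval arithmetic and are not reproduced here). All names and the quantization step are hypothetical:

```python
from collections import defaultdict

def invariant_key(p, basis, q=0.5):
    """Express point p in the affine frame of a basis triple (o, a, b)
    and quantise the coordinates.  These coordinates are unchanged by
    any affine transform applied to the whole point set."""
    (ox, oy), (ax, ay), (bx, by) = basis
    ux, uy = ax - ox, ay - oy
    vx, vy = bx - ox, by - oy
    det = ux * vy - uy * vx
    px, py = p[0] - ox, p[1] - oy
    alpha = (px * vy - py * vx) / det
    beta = (ux * py - uy * px) / det
    return (round(alpha / q), round(beta / q))

def build_table(models):
    """Index every non-basis model point by its quantised coordinates."""
    table = defaultdict(set)
    for name, pts in models.items():
        for p in pts[3:]:
            table[invariant_key(p, pts[:3])].add(name)
    return table

def vote(table, scene_pts):
    """Use the first three scene points as a basis and vote for models."""
    votes = defaultdict(int)
    for p in scene_pts[3:]:
        for name in table.get(invariant_key(p, scene_pts[:3]), ()):
            votes[name] += 1
    return votes

models = {"widget": [(0, 0), (1, 0), (0, 1), (2, 2), (3, 1)]}
table = build_table(models)
# The same object translated by (5, 5) hashes to the same keys.
scene = [(5, 5), (6, 5), (5, 6), (7, 7), (8, 6)]
votes = vote(table, scene)
```

The paper's scheme replaces these single-view invariants with parameter ranges derived from a few reference views, which is what lets it index novel views that look nothing like any stored view.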
Computational Models of Perceptual Organization
 Robotics Institute, Carnegie Mellon University
, 2003
Cited by 4 (0 self)
Abstract
Perceptual organization refers to the process of organizing sensory input into coherent and interpretable perceptual structures. This process is challenging due to the chicken-and-egg nature between the various subprocesses such as image segmentation, figure-ground segregation and object recognition. Low-level processing requires the guidance of high-level knowledge to overcome noise; while high-level processing relies on low-level processes to reduce the computational complexity. Neither process can be sufficient on its own. Consequently, any system that carries out these processes in a sequence is bound to be brittle. An alternative system is one in which all processes interact with each other simultaneously. In this thesis, we develop a set of simple yet realistic interactive processing models for perceptual organization. We model the processing in the framework of spectral graph theory, with a criterion encoding the overall goodness of perceptual organization. We derive fast solutions for near-global optima of the criterion, and demonstrate the efficacy of the models on segmenting a wide range of real images. Through these models, we are able to capture a variety of perceptual phenomena: a unified treatment of various grouping, figure-ground and depth cues to produce pop-out, region segmentation and depth segregation in one step; and a unified framework for integrating bottom-up and top-down information to produce an object segmentation from spatial and object attention. We achieve these goals by empowering current spectral graph methods with a principled solution for multiclass spectral graph partitioning; expanded repertoire of grouping cues to include similarity, dissimilarity and ordering relationships; a theory for integrating sparse grouping cues; and a model ...
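A common goodness criterion in this spectral-graph line of work is the normalized cut. The sketch below evaluates it by brute force on a toy six-node graph; this is only feasible at toy scale, whereas the thesis derives fast spectral solutions for near-global optima (the graph weights here are made up for illustration):

```python
from itertools import combinations

def ncut(W, A):
    """Normalised-cut value of bipartition (A, complement) of the weighted
    graph W: cut(A,B)/assoc(A,V) + cut(A,B)/assoc(B,V)."""
    n = len(W)
    B = [i for i in range(n) if i not in A]
    if not A or not B:
        return float("inf")
    cut = sum(W[i][j] for i in A for j in B)
    assoc = lambda S: sum(W[i][j] for i in S for j in range(n))
    return cut / assoc(A) + cut / assoc(B)

def best_bipartition(W):
    """Exhaustive search over bipartitions for the minimum Ncut."""
    n = len(W)
    best = (None, float("inf"))
    for k in range(1, n // 2 + 1):
        for A in combinations(range(n), k):
            v = ncut(W, set(A))
            if v < best[1]:
                best = (set(A), v)
    return best

# Two tightly connected triangles joined by one weak edge.
W = [[0.0] * 6 for _ in range(6)]
for i, j, w in [(0, 1, 1), (0, 2, 1), (1, 2, 1),
                (3, 4, 1), (3, 5, 1), (4, 5, 1), (2, 3, 0.1)]:
    W[i][j] = W[j][i] = w
A, v = best_bipartition(W)
# The weak bridge is the cheapest place to cut: A is one triangle.
```

Normalizing the cut by each side's total association is what discourages the degenerate splits that a plain minimum cut prefers, and it is this criterion that the spectral methods relax into an eigenvector problem.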
3D to 2D Recognition with Regions
 IEEE Conference on Computer Vision and Pattern Recognition
, 1997
Cited by 3 (0 self)
Abstract
This paper presents a novel approach to parts-based object recognition in the presence of occlusion. We focus on the problem of determining the pose of a 3D object from a single 2D image when convex parts of the object have been matched to corresponding regions in the image. We consider three types of occlusions: self-occlusion, occlusions whose locus is identified in the image, and completely arbitrary occlusions. We derive efficient algorithms for the first two cases, and characterize their performance. For the last case, we prove that the problem of finding valid poses is computationally hard, but provide an efficient, approximate algorithm. This work generalizes our previous work on region-based object recognition, which focused on the case of planar models. A preliminary version of this paper has appeared in [29]. A brief overview of these and related results has appeared in [8]. This research was supported by the United States-Israel Binational Science Foundation, Gr...