Fast approximate energy minimization via graph cuts
IEEE Transactions on Pattern Analysis and Machine Intelligence, 2001
Cited by 905 (38 self)
In this paper we address the problem of minimizing a large class of energy functions that occur in early vision. The major restriction is that the energy function’s smoothness term must only involve pairs of pixels. We propose two algorithms that use graph cuts to compute a local minimum even when very large moves are allowed. The first move we consider is an α-β swap: for a pair of labels α, β, this move exchanges the labels between an arbitrary set of pixels labeled α and another arbitrary set labeled β. Our first algorithm generates a labeling such that there is no swap move that decreases the energy. The second move we consider is an α-expansion: for a label α, this move assigns an arbitrary set of pixels the label α. Our second ...
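To make the α-expansion move concrete, here is a minimal sketch on a 1D row of pixels. It assumes a Potts smoothness term (a fixed penalty whenever two adjacent pixels disagree) and finds the best expansion by brute-force enumeration rather than by a graph cut, so it only illustrates the move space, not the paper's algorithm. The names `energy`, `best_expansion`, `data_cost`, and `smooth_weight` are hypothetical.

```python
from itertools import product


def energy(labels, data_cost, smooth_weight=1.0):
    """Data term plus a Potts smoothness term over adjacent pixel pairs."""
    data = sum(data_cost[p][label] for p, label in enumerate(labels))
    smooth = sum(smooth_weight for a, b in zip(labels, labels[1:]) if a != b)
    return data + smooth


def best_expansion(labels, alpha, data_cost, smooth_weight=1.0):
    """Return the lowest-energy labeling reachable by one alpha-expansion.

    In an alpha-expansion every pixel either keeps its current label or
    switches to alpha. All such labelings are enumerated here (assumption:
    the input is tiny); the paper computes this step with a graph cut.
    """
    best = tuple(labels)
    best_energy = energy(best, data_cost, smooth_weight)
    for switch in product([False, True], repeat=len(labels)):
        candidate = tuple(alpha if s else old for s, old in zip(switch, labels))
        e = energy(candidate, data_cost, smooth_weight)
        if e < best_energy:
            best, best_energy = candidate, e
    return best, best_energy


# Four pixels, two labels; data_cost[p][label] is the cost of giving pixel p that label.
cost = [[0, 3], [3, 0], [3, 0], [0, 3]]
print(best_expansion((0, 0, 0, 0), 1, cost))  # ((0, 1, 1, 0), 2.0)
```

Starting from the all-zero labeling (energy 6), a single expansion of label 1 switches the two middle pixels at once, which a one-pixel-at-a-time move could not do without first passing through a labeling of energy 5.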
Bayesian Interpolation
Neural Computation, 1991
Cited by 417 (17 self)
Although Bayesian analysis has been in use since Laplace, the Bayesian method of model comparison has only recently been developed in depth. In this paper, the Bayesian approach to regularisation and model comparison is demonstrated by studying the inference problem of interpolating noisy data. The concepts and methods described are quite general and can be applied to many other problems. Regularising constants are set by examining their posterior probability distribution. Alternative regularisers (priors) and alternative basis sets are objectively compared by evaluating the evidence for them. 'Occam's razor' is automatically embodied by this framework. The way in which Bayes infers the values of regularising constants and noise levels has an elegant interpretation in terms of the effective number of parameters determined by the data set. This framework is due to Gull and Skilling. 1 Data modelling and Occam's razor In science, a central task is to develop and compare models to a...
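The "effective number of parameters" can be illustrated with a short sketch. It assumes the usual evidence-framework form γ = Σ λ_i / (λ_i + α), where the λ_i are eigenvalues of the data-term Hessian and α is the regularising constant; the abstract itself does not spell out this formula, and the function name `effective_parameters` is hypothetical.

```python
def effective_parameters(eigenvalues, alpha):
    """Sum of lambda / (lambda + alpha) over the data-Hessian eigenvalues.

    Each term lies between 0 and 1: near 1 when the data constrain that
    direction far more strongly than the prior (lambda >> alpha), and
    near 0 when the prior dominates (lambda << alpha).
    """
    return sum(lam / (lam + alpha) for lam in eigenvalues)


# One well-determined direction, one balanced, one set mostly by the prior.
gamma = effective_parameters([100.0, 1.0, 0.01], alpha=1.0)
print(round(gamma, 6))  # 1.5
```

With three nominal parameters, only about one and a half are determined by the data in this example; the rest are fixed by the regulariser.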
Object-Centered Surface Reconstruction: Combining Multi-Image Stereo and Shading
International Journal of Computer Vision, 1995
Cited by 103 (19 self)
Our goal is to reconstruct both the shape and reflectance properties of surfaces from multiple images. We argue that an object-centered representation is most appropriate for this purpose because it naturally accommodates multiple sources of data, multiple images (including motion sequences of a rigid object), and self-occlusions. We then present a specific object-centered reconstruction method and its implementation. The method begins with an initial estimate of surface shape provided, for example, by triangulating the result of conventional stereo. The surface shape and reflectance properties are then iteratively adjusted to minimize an objective function that combines information from multiple input images. The objective function is a weighted sum of stereo, shading, and smoothness components, where the weight varies over the surface. For example, the stereo component is weighted more strongly where the surface projects onto highly textured areas in the images, and less strongly othe...
Hierarchic Voronoi Skeletons
1995
Cited by 100 (3 self)
Robust and time-efficient skeletonization of a (planar) shape, which is connectivity preserving and based on Euclidean metrics, can be achieved by first regularizing the Voronoi diagram (VD) of a shape's boundary points, i.e., by removal of noise-sensitive parts of the tessellation and then by establishing a hierarchic organization of skeleton constituents. Each component of the VD is attributed with a measure of prominence which exhibits the expected invariance under geometric transformations and noise. The second processing step, a hierarchic clustering of skeleton branches, leads to a multiresolution representation of the skeleton, termed skeleton pyramid.
Single Lens Stereo with a Plenoptic Camera
1992
Cited by 88 (0 self)
Ordinary cameras gather light across the area of their lens aperture, and the light striking a given subregion of the aperture is structured somewhat differently than the light striking an adjacent subregion. By analyzing this optical structure, one can infer the depths of objects in the scene, i.e., one can achieve "single lens stereo." We describe a novel camera for performing this analysis. It incorporates a single main lens along with a lenticular array placed at the sensor plane. The resulting "plenoptic camera" provides information about how the scene would look when viewed from a continuum of possible viewpoints bounded by the main lens aperture. Deriving depth information is simpler than in a binocular stereo system because the correspondence problem is minimized. The camera extracts information about both horizontal and vertical parallax, which improves the reliability of the depth estimates.
Efficient Graph-Based Energy Minimization Methods In Computer Vision
1999
Cited by 63 (4 self)
ms (we show that exact minimization is NP-hard in these cases). These algorithms produce a local minimum in interesting large move spaces. Furthermore, one of them finds a solution within a known factor from the optimum. The algorithms are iterative and compute several graph cuts at each iteration. The running time at each iteration is effectively linear due to the special graph structure. In practice it takes just a few iterations to converge. Moreover, most of the progress happens during the first iteration. For a certain piecewise constant prior we adapt the algorithms developed for the piecewise smooth prior. One of them finds a solution within a factor of two from the optimum. In addition we develop a third algorithm which finds a local minimum in yet another move space. We demonstrate the effectiveness of our approach on image restoration, stereo, and motion. For the data with ground truth, our methods significantly outperform standard methods. Biographical Sketch Olga
A self-organizing multiple-view representation of 3D objects
1991
Cited by 55 (15 self)
We explore representation of 3D objects in which several distinct 2D views are stored for each object. We demonstrate the ability of a two-layer network of thresholded summation units to support such representations. Using unsupervised Hebbian relaxation, the network learned to recognize ten objects from different viewpoints. The training process led to the emergence of compact representations of the specific input views. When tested on novel views of the same objects, the network exhibited a substantial generalization capability. In simulated psychophysical experiments, the network's behavior was qualitatively similar to that of human subjects.
A theory of cortical responses
2005
Cited by 46 (16 self)
This article concerns the nature of evoked brain responses and the principles underlying their generation. We start with the premise that the sensory brain has evolved to represent or infer the causes of changes in its sensory inputs. The problem of inference is well formulated in statistical terms. The statistical fundaments of inference may therefore afford important constraints on neuronal implementation. By formulating the original ideas of Helmholtz on perception, in terms of modern-day statistical theories, one arrives at a model of perceptual inference and learning that can explain a remarkable range of neurobiological facts. It turns out that the problems of inferring the causes of sensory input (perceptual inference) and learning the relationship between input and cause (perceptual learning) can be resolved using exactly the same principle. Specifically, both inference and learning rest on minimizing the brain’s free energy, as defined in statistical physics. Furthermore, inference and learning can proceed in a biologically plausible fashion. Cortical responses can be seen as the brain’s attempt to minimize the free energy induced by a stimulus and thereby encode the most likely cause of that stimulus. Similarly, learning emerges from changes in synaptic efficacy that minimize the free energy, averaged over all stimuli encountered. The underlying scheme rests on empirical Bayes and hierarchical models
Slow and Smooth: a Bayesian theory for the combination of local motion signals in human vision
1998
Cited by 39 (3 self)
In order to estimate the motion of an object, the visual system needs to combine multiple local measurements, each of which carries some degree of ambiguity. We present a model of motion perception whereby measurements from different image regions are combined according to a Bayesian estimator: the estimated motion maximizes the posterior probability assuming a prior favoring slow and smooth velocities. In reviewing a large number of previously published phenomena we find that the Bayesian estimator predicts a wide range of psychophysical results. This suggests that the seemingly complex set of illusions arise from a single computational strategy that is optimal under reasonable assumptions. 1 Introduction Estimating motion in scenes containing multiple, complex motions remains a difficult problem for computer vision systems, yet is performed effortlessly by human observers. Motion analysis in such scenes imposes conflicting demands on the design of a vision system (Braddick, 1993)....
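A rough sketch of how a prior favoring slow velocities shapes the combined estimate. This is a simplification, not the paper's model: it assumes one rigid 2D velocity, Gaussian noise on each local normal-speed measurement, and a zero-mean Gaussian prior on velocity (the "slow" part only, with no smoothness term). The name `map_velocity` and its parameters are hypothetical.

```python
def map_velocity(normals, speeds, sigma=1.0, sigma_prior=1.0):
    """MAP estimate of a single 2D velocity v = (vx, vy).

    Each measurement says n . v = c for a unit normal n and normal speed c,
    with Gaussian noise of standard deviation sigma. The prior is a
    zero-mean Gaussian on v with standard deviation sigma_prior.
    Maximizing the posterior reduces to solving the 2x2 system A v = b.
    """
    prior_weight = 1.0 / sigma_prior ** 2
    data_weight = 1.0 / sigma ** 2
    a11, a12, a22 = prior_weight, 0.0, prior_weight
    b1 = b2 = 0.0
    for (nx, ny), c in zip(normals, speeds):
        a11 += data_weight * nx * nx
        a12 += data_weight * nx * ny
        a22 += data_weight * ny * ny
        b1 += data_weight * c * nx
        b2 += data_weight * c * ny
    det = a11 * a22 - a12 * a12  # positive because prior_weight > 0
    return ((a22 * b1 - a12 * b2) / det, (a11 * b2 - a12 * b1) / det)


# A single edge measurement: only the horizontal component is constrained.
print(map_velocity([(1.0, 0.0)], [1.0]))  # (0.5, 0.0)
```

With one ambiguous measurement the estimate lies along the edge normal and is pulled toward zero by the prior; adding measurements with other orientations, or widening `sigma_prior`, moves it toward the velocity that satisfies all constraints.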
Complex feature recognition: A Bayesian approach for learning to recognize objects
AI Memo 1591, Massachusetts Institute of Technology, 1996
Cited by 32 (3 self)
This publication can be retrieved by anonymous ftp to publications.ai.mit.edu. We have developed a new Bayesian framework for visual object recognition which is based on the insight that images of objects can be modeled as a conjunction of local features. This framework can be used to both derive an object recognition algorithm and an algorithm for learning the features themselves. The overall approach, called complex feature recognition or CFR, is unique for several reasons: it is broadly applicable to a wide range of object types, it makes constructing object models easy, it is capable of identifying either the class or the identity of an object, and it is computationally efficient – requiring time proportional to the size of the image. Instead of a single simple feature such as an edge, CFR uses a large set of complex features that are learned from experience with model objects. The response of a single complex feature contains much more class information than does a single edge. This significantly reduces the number of possible correspondences between the model and the image. In addition, CFR takes advantage of a type of image processing called oriented energy. Oriented energy is used to efficiently pre-process the image to eliminate some of the difficulties associated with changes in lighting and pose.

