Results 1-10 of 29

Learning low-level vision
International Journal of Computer Vision, 2000
Cited by 469 (25 self)
We show a learning-based method for low-level vision problems. We set up a Markov network of patches of the image and the underlying scene. A factorization approximation allows us to easily learn the parameters of the Markov network from synthetic examples of image/scene pairs, and to efficiently propagate image information. Monte Carlo simulations justify this approximation. We apply this to the "super-resolution" problem (estimating high-frequency details from a low-resolution image), showing good results. For the motion estimation problem, we show resolution of the aperture problem and filling-in arising from application of the same probabilistic machinery.

Regularization Theory and Neural Networks Architectures
Neural Computation, 1995
Cited by 314 (31 self)
We had previously shown that regularization principles lead to approximation schemes which are equivalent to networks with one layer of hidden units, called Regularization Networks. In particular, standard smoothness functionals lead to a subclass of regularization networks, the well known Radial Basis Functions approximation schemes. This paper shows that regularization networks encompass a much broader range of approximation schemes, including many of the popular general additive models and some of the neural networks. In particular, we introduce new classes of smoothness functionals that lead to different classes of basis functions. Additive splines as well as some tensor product splines can be obtained from appropriate classes of smoothness functionals. Furthermore, the same generalization that extends Radial Basis Functions (RBF) to Hyper Basis Functions (HBF) also leads from additive models to ridge approximation models, containing as special cases Breiman's hinge functions, som...

Linear Object Classes and Image Synthesis From a Single Example Image
IEEE Transactions on Pattern Analysis and Machine Intelligence, 1997
Cited by 200 (23 self)
The need to generate new views of a 3D object from a single real image arises in several fields, including graphics and object recognition. While the traditional approach relies on the use of 3D models, we have recently introduced [1], [2], [3] simpler techniques that are applicable under restricted conditions. The approach exploits image transformations that are specific to the relevant object class, and learnable from example views of other “prototypical” objects of the same class. In this paper, we introduce such a technique by extending the notion of linear class proposed by Poggio and Vetter. For linear object classes, it is shown that linear transformations can be learned exactly from a basis set of 2D prototypical views. We demonstrate the approach on artificial objects and then show preliminary evidence that the technique can effectively “rotate” high-resolution face images from a single 2D view. Index Terms: 3D object recognition, rotation invariance, deformable templates, image synthesis.

A Theory of Networks for Approximation and Learning
Laboratory, Massachusetts Institute of Technology, 1989
Cited by 195 (24 self)
Learning an input-output mapping from a set of examples, of the type that many neural networks have been constructed to perform, can be regarded as synthesizing an approximation of a multidimensional function, that is, solving the problem of hypersurface reconstruction. From this point of view, this form of learning is closely related to classical approximation techniques, such as generalized splines and regularization theory. This paper considers the problems of an exact representation and, in more detail, of the approximation of linear and nonlinear mappings in terms of simpler functions of fewer variables. Kolmogorov's theorem concerning the representation of functions of several variables in terms of functions of one variable turns out to be almost irrelevant in the context of networks for learning. We develop a theoretical framework for approximation based on regularization techniques that leads to a class of three-layer networks that we call Generalized Radial Basis Functions (GRBF), since they are mathematically related to the well-known Radial Basis Functions, mainly used for strict interpolation tasks. GRBF networks are not only equivalent to generalized splines, but are also closely related to pattern recognition methods such as Parzen windows and potential functions and to several neural network algorithms, such as Kanerva's associative memory, backpropagation and Kohonen's topology preserving map. They also have an interesting interpretation in terms of prototypes that are synthesized and optimally combined during the learning stage. The paper introduces several extensions and applications of the technique and discusses intriguing analogies with neurobiological data.
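As a rough illustration of the regularization view described in the abstract above, the following sketch fits a one-dimensional network with one Gaussian unit per training example by solving (G + lam * I) c = y. It is not the paper's GRBF formulation, which allows fewer centers than examples; the function names, the Gaussian width `sigma`, the regularization weight `lam`, and the sine-wave data are all assumptions chosen for the example.

```python
import math

def gaussian_kernel(x, center, sigma):
    """Gaussian radial basis function of the distance between two scalars."""
    return math.exp(-((x - center) ** 2) / (2.0 * sigma ** 2))

def solve_linear_system(matrix, rhs):
    """Solve matrix @ x = rhs by Gaussian elimination with partial pivoting."""
    n = len(rhs)
    # Work on an augmented copy so the caller's data is not modified.
    aug = [list(row) + [value] for row, value in zip(matrix, rhs)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for row in range(col + 1, n):
            factor = aug[row][col] / aug[col][col]
            for k in range(col, n + 1):
                aug[row][k] -= factor * aug[col][k]
    # Back substitution on the upper-triangular system.
    solution = [0.0] * n
    for row in range(n - 1, -1, -1):
        tail = sum(aug[row][k] * solution[k] for k in range(row + 1, n))
        solution[row] = (aug[row][n] - tail) / aug[row][row]
    return solution

def fit_rbf_network(xs, ys, sigma=1.0, lam=1e-3):
    """Return coefficients c solving (G + lam * I) c = y, one unit per example."""
    n = len(xs)
    gram = [[gaussian_kernel(xs[i], xs[j], sigma) + (lam if i == j else 0.0)
             for j in range(n)] for i in range(n)]
    return solve_linear_system(gram, ys)

def predict(x, xs, coeffs, sigma=1.0):
    """Evaluate the network: a weighted sum of Gaussians centered on the examples."""
    return sum(c * gaussian_kernel(x, center, sigma) for c, center in zip(coeffs, xs))

# Hypothetical training data: five samples of a sine wave.
xs = [0.0, 1.0, 2.0, 3.0, 4.0]
ys = [math.sin(x) for x in xs]
coeffs = fit_rbf_network(xs, ys, sigma=1.0, lam=1e-6)
estimate = predict(2.5, xs, coeffs, sigma=1.0)  # value between two training points
```

With `lam` close to zero the network nearly interpolates the training points; larger values trade fidelity to the data for smoothness.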

A multiscale retinex for bridging the gap between color images and the human observation of scenes
IEEE Transactions on Image Processing, 1997
Cited by 138 (9 self)
Direct observation and recorded color images of the same scenes are often strikingly different because human visual perception computes the conscious representation with vivid color and detail in shadows, and with resistance to spectral shifts in the scene illuminant. A computation for color images that approaches fidelity to scene observation must combine dynamic range compression, color consistency (a computational analog for human vision color constancy), and color and lightness tonal rendition. In this paper, we extend a previously designed single-scale center/surround retinex to a multiscale version that achieves simultaneous dynamic range compression/color consistency/lightness rendition. This extension fails to produce good color rendition for a class of images that contain violations of the gray-world assumption implicit to the theoretical foundation of the retinex. Therefore, we define a method of color restoration that corrects for this deficiency at the cost of a modest dilution in color consistency. Extensive testing of the multiscale retinex with color restoration on several test scenes and over a hundred images did not reveal any pathological behavior.
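The center/surround idea in the abstract above can be sketched on a one-dimensional signal: each single-scale output is the log of the signal minus the log of its Gaussian-blurred surround, and the multiscale output is a weighted average over several surround widths. This is only a minimal sketch under assumptions: the three scales, the equal weights, and the edge handling are illustrative choices, and the color restoration step is omitted.

```python
import math

def gaussian_surround(signal, sigma):
    """Blur a 1-D signal with a normalized Gaussian, clamping indices at the edges."""
    radius = max(1, int(3 * sigma))
    offsets = range(-radius, radius + 1)
    weights = [math.exp(-(k * k) / (2.0 * sigma * sigma)) for k in offsets]
    total = sum(weights)
    n = len(signal)
    blurred = []
    for i in range(n):
        acc = 0.0
        for offset, w in zip(offsets, weights):
            j = min(max(i + offset, 0), n - 1)  # repeat the edge sample
            acc += w * signal[j]
        blurred.append(acc / total)
    return blurred

def single_scale_retinex(signal, sigma):
    """log(center) - log(surround) at every sample; intensities must be positive."""
    surround = gaussian_surround(signal, sigma)
    return [math.log(s) - math.log(b) for s, b in zip(signal, surround)]

def multiscale_retinex(signal, sigmas=(1.0, 4.0, 16.0)):
    """Equal-weight average of single-scale outputs (scales are assumed values)."""
    weight = 1.0 / len(sigmas)
    outputs = [single_scale_retinex(signal, s) for s in sigmas]
    return [weight * sum(values) for values in zip(*outputs)]

# The same step edge under dim and 100x brighter illumination.
dim = [1.0] * 20 + [2.0] * 20
bright = [100.0 * v for v in dim]
msr_dim = multiscale_retinex(dim)
msr_bright = multiscale_retinex(bright)
```

Because the output depends only on the ratio of each sample to its surround, `msr_dim` and `msr_bright` agree up to rounding error, which is one simple way to see the insensitivity to a uniform change in illumination level.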

Synthesis of Novel Views From a Single Face Image
1996
Cited by 52 (5 self)
Images formed by a human face change with viewpoint. A new technique is described for synthesizing images of faces from new viewpoints, when only a single 2D image is available. A novel 2D image of a face can be computed without knowledge about the 3D structure of the head. The technique draws on a single generic 3D model of a human head and on prior knowledge of faces based on example images of other faces seen in different poses. The example images are used to "learn" a pose-invariant shape and texture description of a new face. The 3D model is used to solve the correspondence problem between images showing faces in different poses. Examples of synthetic "rotations" over 24° based on a training set of 100 faces are shown. This document is available as /pub/mpimemos/TR026.ps.Z via anonymous ftp from ftp.mpiktueb.mpg.de or from the World Wide Web, http://www.mpiktueb.mpg.de/projects/TechReport/list.html. 1 Introduction Given only a driver's license photograph of a person's ...

Incorporating Prior Information in Machine Learning by Creating Virtual Examples
Proceedings of the IEEE, 1998
Cited by 42 (2 self)
One of the key problems in supervised learning is the insufficient size of the training set. The natural way for an intelligent learner to counter this problem and successfully generalize is to exploit prior information that may be available about the domain or that can be learned from prototypical examples. We discuss the notion of using prior knowledge by creating virtual examples and thereby expanding the effective training set size. We show that in some contexts, this idea is mathematically equivalent to incorporating the prior knowledge as a regularizer, suggesting that the strategy is well-motivated. The process of creating virtual examples in real-world pattern recognition tasks is highly nontrivial. We provide demonstrative examples from object recognition and speech recognition to illustrate the idea. 1 Learning from Examples Recently, machine learning techniques have become increasingly popular as an alternative to knowledge-based approaches to artificial intelligence pro...
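The virtual-example idea in the abstract above can be illustrated with a toy case. The sketch assumes the prior knowledge is that a pattern's class does not change under cyclic translation, so every shifted copy of a labeled example is added to the training set. The patterns, labels, and nearest-neighbor classifier are hypothetical and are not taken from the paper's object or speech recognition experiments.

```python
def cyclic_shifts(pattern):
    """All cyclic translations of a 1-D pattern (the assumed prior invariance)."""
    return [pattern[k:] + pattern[:k] for k in range(len(pattern))]

def expand_with_virtual_examples(examples):
    """Add every translated copy of each labeled example to the training set."""
    expanded = []
    for pattern, label in examples:
        for shifted in cyclic_shifts(pattern):
            expanded.append((shifted, label))
    return expanded

def hamming(a, b):
    """Number of positions at which two equal-length patterns differ."""
    return sum(x != y for x, y in zip(a, b))

def nearest_neighbor_label(query, examples):
    """Label of the training pattern closest to the query in Hamming distance."""
    return min(examples, key=lambda ex: hamming(query, ex[0]))[1]

# Hypothetical training set: one example per class.
training = [
    ((1, 1, 0, 0, 0, 0), "bar"),   # two adjacent pixels on
    ((1, 0, 0, 1, 0, 0), "dots"),  # two separated pixels on
]
virtual = expand_with_virtual_examples(training)

query = (0, 0, 0, 1, 1, 0)  # a "bar" translated by three positions
label_without_prior = nearest_neighbor_label(query, training)
label_with_prior = nearest_neighbor_label(query, virtual)
```

On this toy data the two-example training set misclassifies the shifted bar as "dots", while the expanded set contains an exact translated match and returns "bar".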

Learning to estimate scenes from images
Adv. Neural Information Processing Systems 11, 1999
Cited by 37 (6 self)
We seek the scene interpretation that best explains image data.

Retinex processing for automatic image enhancement
Journal of Electronic Imaging, 2004
Cited by 32 (4 self)
In the last published concept (1986) for a Retinex computation, Edwin Land introduced a center/surround spatial form, which was inspired by the receptive field structures of neurophysiology. With this as our starting point, we have over the years developed this concept into a full-scale automatic image enhancement algorithm, the Multi-Scale Retinex with Color Restoration (MSRCR), which combines color constancy with local contrast/lightness enhancement to transform digital images into renditions that approach the realism of direct scene observation. The MSRCR algorithm has proven to be quite general purpose, and very resilient to common forms of image preprocessing such as reasonable ranges of gamma and contrast stretch transformations. More recently we have been exploring the fundamental scientific implications of this form of image processing, namely: (i) the visual inadequacy of the linear representation of digital images, (ii) the existence of a canonical or statistical ideal visual image, and (iii) new measures of visual quality based upon these insights derived from our extensive experience with MSRCR-enhanced images. The lattermost serves as the basis for future schemes for automating visual assessment, a primitive first step in bringing visual intelligence to computers.

A Multiscale Retinex For Color Rendition and Dynamic Range Compression
1996
Cited by 23 (3 self)
The human vision system performs the tasks of dynamic range compression and color constancy almost effortlessly. The same tasks pose a very challenging problem for imaging systems whose dynamic range is restricted by either the dynamic response of film, in the case of analog cameras, or by the analog-to-digital converters, in the case of digital cameras. The images thus formed are unable to encompass the wide dynamic range present in most natural scenes (often > 500:1). Whereas the human visual system is quite tolerant to spectral changes in lighting conditions, these strongly affect both the film response for analog cameras and the filter responses for digital cameras, leading to incorrect color formulation in the acquired image. Our multiscale retinex, based in part on Edwin Land's work on color constancy, provides a fast, simple, and automatic technique for simultaneous dynamic range compression and accurate color rendition. The retinex algorithm is nonlinear, and global: output at a ...