Results 1–10 of 225
80 million tiny images: a large dataset for nonparametric object and scene recognition
 IEEE Transactions on Pattern Analysis and Machine Intelligence
Image Segmentation by Data Driven Markov Chain Monte Carlo
, 2001
Abstract

Cited by 281 (32 self)
This paper presents a computational paradigm called Data-Driven Markov Chain Monte Carlo (DDMCMC) for image segmentation in the Bayesian statistical framework. The paper contributes to image segmentation in three aspects. First, it designs effective and well-balanced Markov chain dynamics to explore the solution space and makes the split-and-merge process reversible at a middle-level vision formulation, thus achieving a globally optimal solution independent of the initial segmentation. Second, instead of computing a single maximum a posteriori solution, it proposes a mathematical principle for computing multiple distinct solutions to incorporate the intrinsic ambiguities in image segmentation; a k-adventurers algorithm is proposed for extracting multiple distinct solutions from the Markov chain sequence. Third, it utilizes data-driven (bottom-up) techniques, such as clustering and edge detection, to compute importance proposal probabilities, which effectively drive the Markov chain dynamics and achieve a tremendous speedup in comparison to the traditional jump-diffusion method [4]. The DDMCMC paradigm thus provides a unifying framework in which the roles of existing segmentation algorithms, such as edge detection, clustering, region growing, split-merge, SNAKEs, and region competition, are revealed as either realizing Markov chain dynamics or computing importance proposal probabilities. We report some results on color and grey-level image segmentation in this paper and refer to a detailed report and a web site for extensive discussion.
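As a loose illustration of the data-driven proposal idea, here is a minimal sketch (the 1-D two-mode target, the "clustering cues", and all parameters are invented for illustration, not taken from the paper): an independence Metropolis-Hastings sampler whose proposal distribution is built from hypothetical bottom-up cues, letting the chain jump directly between distant modes.

```python
import numpy as np

rng = np.random.default_rng(4)

# Toy two-mode target density over a 1-D "segmentation parameter"
# (purely illustrative; the paper works over full segmentations).
def log_target(x):
    return np.logaddexp(-0.5 * ((x - 2) / 0.3) ** 2,
                        -0.5 * ((x + 2) / 0.3) ** 2)

# Data-driven proposal: suppose a bottom-up clustering step found modes
# near -2 and +2; propose from a broad mixture centered on those cues.
modes = np.array([-2.0, 2.0])

def log_proposal(x):
    return np.logaddexp(-0.5 * ((x - 2) / 0.5) ** 2,
                        -0.5 * ((x + 2) / 0.5) ** 2)

x, samples = 0.0, []
for _ in range(2000):
    y = rng.choice(modes) + 0.5 * rng.normal()
    # Independence-sampler Metropolis-Hastings acceptance ratio.
    log_a = (log_target(y) - log_target(x)) - (log_proposal(y) - log_proposal(x))
    if np.log(rng.random()) < log_a:
        x = y
    samples.append(x)
samples = np.array(samples)
# The data-driven proposal lets the chain visit both modes, where a
# small random-walk proposal would tend to stay stuck in one of them.
```

The same mechanism, with proposals over splits and merges of regions, is what drives the speedup reported in the paper.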
Implicit Probabilistic Models of Human Motion for Synthesis and Tracking (Hedvig Sidenbladh)
 In European Conference on Computer Vision
, 2002
Abstract

Cited by 204 (4 self)
This paper addresses the problem of probabilistically modeling 3D human motion for synthesis and tracking. Given the high-dimensional nature of human motion, learning an explicit probabilistic model from available training data is currently impractical. Instead we exploit methods from texture synthesis that treat images as representing an implicit empirical distribution. These methods replace the problem of representing the probability of a texture pattern with that of searching the training data for similar instances of that pattern. We extend this idea to temporal data representing 3D human motion with a large database of example motions. To make the method useful in practice, we must address the problem of efficient search in a large training set.
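The "implicit empirical distribution" idea, replacing an explicit motion model with search over example data, can be sketched as a nearest-neighbor lookup over short temporal windows. The database below is random stand-in data, not real motion capture, and all sizes are illustrative:

```python
import numpy as np

rng = np.random.default_rng(5)

# Hypothetical database of example motions: 500 frames of 30-D pose vectors.
database = rng.normal(size=(500, 30))
WINDOW = 5  # match on short temporal windows, not single frames

def nearest_continuation(recent):
    """Find the training window most similar to the recent motion and
    return the frame that followed it, as one synthesis/prediction step."""
    best_i, best_d = None, np.inf
    for i in range(len(database) - WINDOW):
        d = np.sum((database[i:i + WINDOW] - recent) ** 2)
        if d < best_d:
            best_i, best_d = i, d
    return database[best_i + WINDOW]

query = database[100:105] + 0.01 * rng.normal(size=(WINDOW, 30))
nxt = nearest_continuation(query)
# With a near-duplicate query, the match recovers the original continuation.
```

The linear scan here is exactly the "efficient search" problem the abstract flags: a real system needs an index structure rather than brute force.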
Real-time texture synthesis by patch-based sampling
 ACM Transactions on Graphics
, 2001
Abstract

Cited by 170 (12 self)
We present a patch-based sampling algorithm for synthesizing textures from an input sample texture. The patch-based sampling algorithm is fast: using patches of the sample texture as building blocks, it makes high-quality texture synthesis a real-time process. For generating textures of the same size and comparable (or better) quality, patch-based sampling is orders of magnitude faster than existing texture synthesis algorithms. The algorithm synthesizes high-quality results for a wide variety of textures, ranging from regular to stochastic. By sampling patches according to a nonparametric estimation of the local conditional MRF density, we avoid mismatching features across patch boundaries. Moreover, the patch-based sampling algorithm remains effective when pixel-based nonparametric sampling algorithms fail to produce good results. For natural textures, the results of patch-based sampling look subjectively better.
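The boundary-matching idea can be sketched crudely as follows. This is only an illustration with random candidate patches, sum-of-squares matching, and simple averaging in the overlap zone; it is not the paper's actual conditional-density sampling or blending scheme, and all sizes are invented:

```python
import numpy as np

rng = np.random.default_rng(1)

# Toy stand-in for the input sample texture.
sample = rng.integers(0, 256, size=(64, 64)).astype(float)
PATCH, OVERLAP = 16, 4

def random_patch():
    y = rng.integers(0, sample.shape[0] - PATCH)
    x = rng.integers(0, sample.shape[1] - PATCH)
    return sample[y:y + PATCH, x:x + PATCH]

def next_patch(prev, n_candidates=50):
    """Pick, among random candidates, the patch whose left overlap strip
    best matches the previous patch's right strip (a crude proxy for
    sampling from the local conditional MRF density)."""
    target = prev[:, -OVERLAP:]
    best, best_err = None, np.inf
    for _ in range(n_candidates):
        cand = random_patch()
        err = np.sum((cand[:, :OVERLAP] - target) ** 2)
        if err < best_err:
            best, best_err = cand, err
    return best

# Grow one row of texture patch by patch, averaging the overlap strips.
row = random_patch()
for _ in range(3):
    p = next_patch(row[:, -PATCH:])
    blend = 0.5 * (row[:, -OVERLAP:] + p[:, :OVERLAP])
    row = np.hstack([row[:, :-OVERLAP], blend, p[:, OVERLAP:]])
```

Because whole patches are placed at once, the number of sampling decisions drops by a factor of roughly the patch area compared with pixel-at-a-time synthesis, which is where the speed comes from.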
Prior Learning and Gibbs Reaction-Diffusion
, 1997
Abstract

Cited by 170 (17 self)
This article addresses two important themes in early visual computation: first, it presents a novel theory for learning the universal statistics of natural images, a prior model for typical cluttered scenes of the world, from a set of natural images; second, it proposes a general framework for designing reaction-diffusion equations for image processing. We start by studying the statistics of natural images, including their scale-invariant properties; generic prior models are then learned to duplicate the observed statistics, based on the minimax entropy theory studied in two previous papers. The resulting Gibbs distributions have potentials of the form U(I; Λ, S) = Σ_{α=1}^{K} Σ_{x,y} λ^(α)((F^(α) * I)(x, y)), with S = {F^(α)} a set of filters and Λ = {λ^(α)} the potential functions. The learned Gibbs distributions confirm and improve the form of existing prior models such as line-process models, but, in contrast to all previous models, inverted potentials (i.e., λ(x) decreasing as a function of |x|) were found to be necessary. We find that the partial differential equations given by gradient descent on U(I; Λ, S) are essentially reaction-diffusion equations, where the usual energy terms produce anisotropic diffusion while the inverted energy terms produce reaction associated with pattern formation, enhancing preferred image features. We illustrate how these models can be used for texture pattern rendering, denoising, image enhancement, and clutter removal by careful choice of both prior and data models of this type, incorporating the appropriate features. Song Chun Zhu is now with the Computer Science Department, Stanford University, Stanford, CA 94305, and David Mumford is with the Division of Applied Mathematics, Brown University, Providence, RI 02912. This work started when the authors were at ...
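The diffusion half of this picture can be sketched with a Perona-Malik-style update, the kind of PDE that gradient descent on conventional (non-inverted) energy terms yields; the inverted "reaction" terms of the paper are omitted here, and all parameters are illustrative:

```python
import numpy as np

def diffusion_step(img, dt=0.1, k=10.0):
    """One explicit anisotropic-diffusion step (Perona-Malik style)."""
    # Forward differences; appending the last row/column forces a zero
    # gradient (no-flux boundary) at the far edges.
    gx = np.diff(img, axis=1, append=img[:, -1:])
    gy = np.diff(img, axis=0, append=img[-1:, :])
    # Edge-stopping conductance: small where the gradient is large,
    # so strong edges diffuse little while flat noise is smoothed.
    cx = 1.0 / (1.0 + (gx / k) ** 2)
    cy = 1.0 / (1.0 + (gy / k) ** 2)
    # Divergence of the conducted flux (backward differences; the
    # wrapped-in flux at the border is the zero boundary flux above).
    fx, fy = cx * gx, cy * gy
    div = (fx - np.roll(fx, 1, axis=1)) + (fy - np.roll(fy, 1, axis=0))
    return img + dt * div

rng = np.random.default_rng(2)
# Noisy step image: left half 0, right half 100, Gaussian noise on top.
noisy = np.where(np.arange(64) < 32, 0.0, 100.0)[None, :] + rng.normal(0, 5, (64, 64))
smoothed = noisy.copy()
for _ in range(20):
    smoothed = diffusion_step(smoothed)
# Noise within each flat half is reduced while the large step survives,
# because the conductance nearly vanishes across the strong edge.
```

The paper's reaction terms would add the opposite behavior on preferred features, sharpening them instead of flattening them.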
Probabilistic framework for the adaptation and comparison of image codes
 J. OPT. SOC. AM. A
, 1999
Abstract

Cited by 144 (12 self)
We apply a Bayesian method for inferring an optimal basis to the problem of finding efficient image codes for natural scenes. The basis functions learned by the algorithm are oriented and localized in both space and frequency, bearing a resemblance to two-dimensional Gabor functions, and increasing the number of basis functions results in a greater sampling density in position, orientation, and scale. These properties also resemble the spatial receptive fields of neurons in the primary visual cortex of mammals, suggesting that the receptive-field structure of these neurons can be accounted for by a general efficient coding principle. The probabilistic framework provides a method for comparing the coding efficiency of different bases objectively, by calculating their probability given the observed data or by measuring the entropy of the basis function coefficients. The learned bases are shown to have better coding efficiency than traditional Fourier and wavelet bases. This framework also provides a Bayesian solution to the problems of image denoising and filling in missing pixels. We demonstrate that the results obtained by applying the learned bases to these problems improve over those obtained with traditional techniques.
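The entropy criterion for comparing bases can be sketched as follows. The "natural" signals here are just random walks (low-frequency dominated), and an explicit orthonormal DCT stands in for a learned basis; everything is a toy illustration of the comparison, not the paper's method:

```python
import numpy as np

rng = np.random.default_rng(6)
N = 32

# Orthonormal DCT-II basis built explicitly (rows are basis functions).
k = np.arange(N)[:, None]
n = np.arange(N)[None, :]
dct = np.cos(np.pi * k * (2 * n + 1) / (2 * N)) * np.sqrt(2.0 / N)
dct[0] /= np.sqrt(2.0)

def coeff_entropy(coeffs, bins=50):
    """Histogram estimate of the entropy (in bits) of coefficient values."""
    p, _ = np.histogram(coeffs.ravel(), bins=bins)
    p = p[p > 0] / p.sum()
    return -(p * np.log2(p)).sum()

# Smooth "natural" signals: random walks, whose energy sits at low
# frequencies, much as natural-image power concentrates at low frequencies.
signals = np.cumsum(rng.normal(size=(500, N)), axis=1)
pixel_coeffs = signals          # identity (pixel) basis
dct_coeffs = signals @ dct.T    # frequency basis concentrates the energy
# A basis matched to the signal class concentrates energy in few
# coefficients, so its coefficient histogram is peaked: lower entropy.
```

Lower coefficient entropy is precisely the sense in which one basis "codes more efficiently" than another in this framework.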
Learning to combine bottom-up and top-down segmentation
 in: European Conference on Computer Vision
Abstract

Cited by 131 (0 self)
Bottom-up segmentation based only on low-level cues is a notoriously difficult problem. This difficulty has led to recent top-down segmentation algorithms that are based on class-specific image information. Despite the success of top-down algorithms, they often give coarse segmentations that can be significantly refined using low-level cues. This raises the question of how to combine top-down and bottom-up cues in a principled manner. In this paper we approach this problem using supervised learning. Given a training set of ground-truth segmentations, we train a fragment-based segmentation algorithm that takes into account both bottom-up and top-down cues simultaneously, in contrast to most existing algorithms, which train top-down and bottom-up modules separately. We formulate the problem in the framework of Conditional Random Fields (CRFs) and derive a feature induction algorithm for CRFs, which allows us to efficiently search over thousands of candidate fragments. Whereas pure top-down algorithms often require hundreds of fragments, our simultaneous learning procedure yields algorithms with a handful of fragments that are combined with low-level cues to efficiently compute high-quality segmentations.
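The feature-induction idea, greedily adding whichever candidate feature yields the largest gain, can be sketched with a least-squares stand-in for the CRF likelihood. The data, the feature count, and the correlation-with-residual scoring rule are all illustrative, not the paper's actual induction criterion:

```python
import numpy as np

rng = np.random.default_rng(7)

# Hypothetical setup: 300 training examples, 200 candidate "fragment"
# features, of which only features 3 and 17 actually predict the label.
X = rng.normal(size=(300, 200))
y = (X[:, 3] + X[:, 17] > 0).astype(float)

def induce_features(X, y, n_select=2):
    """Greedy induction: repeatedly add the candidate feature most
    correlated with the current residual of a least-squares fit (a
    crude stand-in for the CRF likelihood-gain criterion)."""
    chosen = []
    residual = y - y.mean()
    for _ in range(n_select):
        scores = np.abs(X.T @ residual)   # correlation with residual
        scores[chosen] = -np.inf          # never re-pick a feature
        chosen.append(int(np.argmax(scores)))
        A = X[:, chosen]
        w, *_ = np.linalg.lstsq(A, y - y.mean(), rcond=None)
        residual = y - y.mean() - A @ w
    return chosen

selected = induce_features(X, y)
# The two informative candidates are found without scoring all pairs,
# which is why greedy induction scales to thousands of candidates.
```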
A Stochastic Grammar of Images
 Foundations and Trends in Computer Graphics and Vision
, 2006
Abstract

Cited by 119 (26 self)
This exploratory paper quests for a stochastic and context-sensitive grammar of images. The grammar should achieve the following four objectives and thus serve as a unified framework of representation, learning, and recognition for a large number of object categories. (i) The grammar represents both the hierarchical decompositions, from scenes to objects, parts, primitives, and pixels, by terminal and non-terminal nodes, and the contexts for spatial and functional relations by horizontal links between the nodes. It formulates each object category as the set of all possible valid configurations produced by the grammar. (ii) The grammar is embodied in a simple And-Or graph representation, where each Or-node points to alternative sub-configurations and an And-node is decomposed into a number of components. This representation supports recursive top-down/bottom-up procedures for image parsing under the Bayesian framework and makes it convenient to scale.
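A toy And-Or graph makes "the set of all valid configurations" concrete. The grammar and node names below are invented for illustration; a real image grammar would carry spatial relations on the horizontal links as well:

```python
# Hypothetical And-Or graph: an And-node lists the parts an object
# decomposes into; an Or-node lists mutually exclusive alternatives.
grammar = {
    "face":  ("AND", ["eyes", "mouth"]),
    "eyes":  ("OR",  ["open-eyes", "closed-eyes"]),
    "mouth": ("OR",  ["smile", "neutral"]),
}

def configurations(node):
    """Enumerate every valid terminal configuration the grammar produces."""
    if node not in grammar:          # terminal node
        return [[node]]
    kind, children = grammar[node]
    if kind == "OR":                 # choose exactly one alternative
        return [c for child in children for c in configurations(child)]
    # AND: include all children, combining their configurations
    combos = [[]]
    for child in children:
        combos = [a + b for a in combos for b in configurations(child)]
    return combos

configs = configurations("face")
# "face" expands to 2 x 2 = 4 valid configurations.
```

The category "face" is exactly this set of configurations, which is the formulation objective (i) describes.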
Statistical edge detection: learning and evaluating edge cues
 IEEE Transactions on Pattern Analysis and Machine Intelligence
, 2003
Abstract

Cited by 89 (10 self)
We formulate edge detection as statistical inference. This statistical edge detection is data driven, unlike standard methods for edge detection, which are model based. For any set of edge detection filters (implementing local edge cues), we use pre-segmented images to learn the probability distributions of filter responses conditioned on whether they are evaluated on or off an edge. Edge detection is formulated as a discrimination task specified by a likelihood ratio test on the filter responses. This approach emphasizes the necessity of modeling the image background (the off-edges). We represent the conditional probability distributions nonparametrically and learn them on two different datasets of 100 (Sowerby) and 50 (South Florida) images. Multiple edge cues, including chrominance and multiple-scale cues, are combined by using their joint distributions; this cue combination is therefore optimal in the statistical sense. We evaluate the effectiveness of different visual cues using the Chernoff information and Receiver Operating Characteristic (ROC) curves. This shows that our approach gives quantitatively better results than the Canny edge detector when the image background contains significant clutter. In addition, it enables us to determine the effectiveness of different edge cues and gives quantitative measures for the advantages of multi-level processing, for the use of chrominance, and for the relative effectiveness of different detectors. Furthermore, we show that we can learn these conditional distributions on one dataset and adapt them to the other with only slight degradation of performance, without knowing the ground truth on the second dataset. This shows that our results are not purely domain specific. We apply the same approach to the spatial grouping of edge cues and obtain analogies to non-maximal suppression and hysteresis.
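The likelihood-ratio formulation is easy to sketch. Below, synthetic gamma-distributed responses stand in for real filter responses on labeled data (the distributions and thresholds are invented for illustration); the on/off conditional densities are learned nonparametrically as histograms, and a pixel is scored by the log-likelihood ratio:

```python
import numpy as np

rng = np.random.default_rng(0)

# Synthetic stand-in for labeled training data: gradient-magnitude
# responses sampled on-edge (typically large) and off-edge (small).
on_edge = rng.gamma(shape=5.0, scale=2.0, size=5000)
off_edge = rng.gamma(shape=1.0, scale=1.0, size=5000)

# Nonparametric conditional densities P(r | on) and P(r | off) as
# shared-bin histograms, floored to avoid log(0).
bins = np.linspace(0, 30, 61)
p_on, _ = np.histogram(on_edge, bins=bins, density=True)
p_off, _ = np.histogram(off_edge, bins=bins, density=True)
p_on += 1e-6
p_off += 1e-6

def log_likelihood_ratio(response):
    """Edge evidence: log P(r | on-edge) - log P(r | off-edge).
    Thresholding this score is the discrimination test."""
    i = np.clip(np.digitize(response, bins) - 1, 0, len(p_on) - 1)
    return float(np.log(p_on[i]) - np.log(p_off[i]))
```

Sweeping the decision threshold over this score traces out exactly the ROC curves the paper uses for evaluation; combining cues means histogramming joint responses instead of a single one.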
A two-step approach to hallucinating faces: global parametric model and local nonparametric model
 In CVPR
, 2001
Abstract

Cited by 87 (4 self)
In this paper, we study face hallucination, or synthesizing a high-resolution face image from a low-resolution input, with the help of a large collection of other high-resolution face images. We develop a two-step statistical modeling approach that integrates both a global parametric model and a local nonparametric model. First, we derive a global linear model to learn the relationship between the high-resolution face images and their smoothed and downsampled lower-resolution counterparts. Second, the residual between an original high-resolution image and the high-resolution image reconstructed by the learned linear model is modeled by a patch-based nonparametric Markov network, to capture the high-frequency content of faces. By integrating both global and local models, we can generate photorealistic face images. Our approach is demonstrated by extensive experiments with high-quality hallucinated faces.
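The first, global step can be sketched with toy 1-D signals standing in for face images (all sizes and data are invented; the real system operates on image vectors and follows the linear map with a patch-based Markov network over the residual):

```python
import numpy as np

rng = np.random.default_rng(3)

# Toy stand-in: "high-res" signals of length 16, "low-res" inputs formed
# by 4x block averaging (a crude smoothed-and-downsampled observation).
def downsample(h):
    return h.reshape(-1, 4).mean(axis=1)

train_high = rng.normal(size=(200, 16))
train_low = np.stack([downsample(h) for h in train_high])

# Step 1 (global parametric model): least-squares linear map low -> high,
# learned from the training pairs.
W, *_ = np.linalg.lstsq(train_low, train_high, rcond=None)

test_high = rng.normal(size=16)
test_low = downsample(test_high)
global_recon = test_low @ W

# Step 2 in the paper models this residual with a patch-based Markov
# network to restore high-frequency detail; here we only exhibit that a
# residual remains for the local model to capture.
residual = test_high - global_recon
```

The split matters because the linear map can only recover what is linearly predictable from the low-resolution input; everything else, the sharp facial detail, lives in the residual.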