Results 1-10 of 20
Removing camera shake from a single photograph
ACM Trans. Graph., 2006
Abstract

Cited by 188 (13 self)
Camera shake during exposure leads to objectionable image blur and ruins many photographs. Conventional blind deconvolution methods typically assume frequency-domain constraints on images, or overly simplified parametric forms for the motion path during camera shake. Real camera motions can follow convoluted paths, and a spatial domain prior can better maintain visually salient image characteristics. We introduce a method to remove the effects of camera shake from seriously blurred images. The method assumes a uniform camera blur over the image and negligible in-plane camera rotation. In order to estimate the blur from the camera shake, the user must specify an image region without saturation effects. We show results for a variety of digital photographs taken from personal photo collections.
Efficient learning of sparse representations with an energy-based model
Advances in Neural Information Processing Systems (NIPS), 2006
Abstract

Cited by 116 (14 self)
We describe a novel unsupervised method for learning sparse, overcomplete features. The model uses a linear encoder, and a linear decoder preceded by a sparsifying non-linearity that turns a code vector into a quasi-binary sparse code vector. Given an input, the optimal code minimizes the distance between the output of the decoder and the input patch while being as similar as possible to the encoder output. Learning proceeds in a two-phase EM-like fashion: (1) compute the minimum-energy code vector, (2) adjust the parameters of the encoder and decoder so as to decrease the energy. The model produces “stroke detectors” when trained on handwritten numerals, and Gabor-like filters when trained on natural image patches. Inference and learning are very fast, requiring no preprocessing and no expensive sampling. Using the proposed unsupervised method to initialize the first layer of a convolutional network, we achieved an error rate slightly lower than the best reported result on the MNIST dataset. Finally, an extension of the method is described to learn topographical filter maps.
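The two-phase procedure in this abstract can be sketched in a few lines of numpy. Everything concrete below is an illustrative assumption, not the paper's implementation: the dimensions, the plain sigmoid standing in for the paper's sparsifying logistic, and the gradient-descent settings.

```python
import numpy as np

rng = np.random.default_rng(0)
n_input, n_code = 64, 128                      # overcomplete: more code units than inputs
W_e = rng.normal(0.0, 0.1, (n_code, n_input))  # linear encoder
W_d = rng.normal(0.0, 0.1, (n_input, n_code))  # linear decoder

def sparsify(z, beta=1.0):
    # stand-in sparsifying non-linearity squashing codes toward quasi-binary values
    return 1.0 / (1.0 + np.exp(-beta * z))

def energy(x, z):
    # reconstruction error plus distance of the code from the encoder's prediction
    recon = W_d @ sparsify(z)
    return np.sum((x - recon) ** 2) + np.sum((z - W_e @ x) ** 2)

def infer_code(x, n_steps=100, lr=0.01, beta=1.0):
    # phase 1: gradient descent on the code to approach the minimum-energy z
    z = W_e @ x
    for _ in range(n_steps):
        s = sparsify(z, beta)
        residual = x - W_d @ s
        grad = -2.0 * (W_d.T @ residual) * beta * s * (1.0 - s) + 2.0 * (z - W_e @ x)
        z -= lr * grad
    return z

def update_params(x, z, lr=0.005):
    # phase 2: one gradient step on decoder and encoder with the code held fixed
    global W_d, W_e
    s = sparsify(z)
    residual = x - W_d @ s
    W_d = W_d + lr * 2.0 * np.outer(residual, s)
    W_e = W_e + lr * 2.0 * np.outer(z - W_e @ x, x)

x = rng.normal(0.0, 1.0, n_input)  # a toy "image patch"
e_init = energy(x, W_e @ x)        # energy at the encoder's initial guess
z = infer_code(x)
e_opt = energy(x, z)               # energy after phase-1 inference
update_params(x, z)
```

Alternating these two phases over many patches is what drives the encoder toward predicting the sparse codes directly, which is why inference is fast once training is done.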
Blind motion deblurring using image statistics
Advances in Neural Information Processing Systems (NIPS)
Abstract

Cited by 42 (3 self)
We address the problem of blind motion deblurring from a single image, caused by a few moving objects. In such situations only part of the image may be blurred, and the scene consists of layers blurred to different degrees. Most existing blind deconvolution research concentrates on recovering a single blurring kernel for the entire image. However, in the case of different motions, the blur cannot be modeled with a single kernel, and trying to deconvolve the entire image with the same kernel will cause serious artifacts. Thus, the task of deblurring needs to involve segmentation of the image into regions with different blurs. Our approach relies on the observation that the statistics of derivative filters in images are significantly changed by blur. Assuming the blur results from a constant-velocity motion, we can limit the search to one-dimensional box filter blurs. This enables us to model the expected derivative distributions as a function of the width of the blur kernel. Those distributions are surprisingly powerful in discriminating regions with different blurs. The approach produces convincing deconvolution results on real-world images with rich texture.
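The key observation, that box blur reshapes derivative statistics as a function of kernel width, can be illustrated with a toy 1-D sketch. The signal and the widths below are made up for illustration; the paper works with full 2-D images and modeled distributions, not just variances.

```python
import numpy as np

def box_blur(signal, width):
    # constant-velocity motion over the exposure = a 1-D box (mean) filter
    kernel = np.ones(width) / width
    return np.convolve(signal, kernel, mode="valid")

def derivative_variance(signal):
    # a summary statistic of the derivative-filter response; blur
    # concentrates the derivative distribution around zero
    return np.var(np.diff(signal))

rng = np.random.default_rng(1)
# piecewise-constant signal with sharp edges, a crude "cartoon" image row
row = np.repeat(rng.normal(size=50), 20)

variances = {w: derivative_variance(box_blur(row, w)) for w in (1, 5, 15)}
```

Wider boxes spread each edge over more samples, so the derivative energy per edge drops roughly as 1/width, which is what makes the statistic discriminative between differently blurred regions.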
Steganalysis using higher-order image statistics
IEEE Transactions on Information Forensics and Security, 2006
Abstract

Cited by 38 (2 self)
Techniques for information hiding (steganography) are becoming increasingly more sophisticated and widespread. With high-resolution digital images as carriers, detecting hidden messages is also becoming considerably more difficult. We describe a universal approach to steganalysis for detecting the presence of hidden messages embedded within digital images. We show that, within multi-scale, multi-orientation image decompositions (e.g., wavelets), first- and higher-order magnitude and phase statistics are relatively consistent across a broad range of images, but are disturbed by the presence of embedded hidden messages. We show the efficacy of our approach on a large collection of images, and on eight different steganographic embedding algorithms.
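As a loose illustration of the kind of feature vector such a detector consumes, one can collect first- and higher-order statistics from a single-level Haar decomposition. The decomposition, the four statistics, and the random test image below are simplified assumptions, not the paper's full multi-scale feature set.

```python
import numpy as np

def haar2d(img):
    # one level of a 2-D Haar decomposition: a simple stand-in for the
    # multi-scale, multi-orientation decompositions used in the paper
    a = (img[0::2, :] + img[1::2, :]) / 2.0
    d = (img[0::2, :] - img[1::2, :]) / 2.0
    ll = (a[:, 0::2] + a[:, 1::2]) / 2.0   # approximation
    lh = (a[:, 0::2] - a[:, 1::2]) / 2.0   # horizontal detail
    hl = (d[:, 0::2] + d[:, 1::2]) / 2.0   # vertical detail
    hh = (d[:, 0::2] - d[:, 1::2]) / 2.0   # diagonal detail
    return ll, lh, hl, hh

def subband_features(img):
    # mean / variance / skewness / kurtosis of each detail subband:
    # the kind of first- and higher-order statistics embedding disturbs
    feats = []
    for band in haar2d(img.astype(float))[1:]:
        c = band.ravel()
        mu, sd = c.mean(), c.std() + 1e-12
        feats += [mu, sd ** 2,
                  np.mean((c - mu) ** 3) / sd ** 3,
                  np.mean((c - mu) ** 4) / sd ** 4]
    return np.array(feats)

rng = np.random.default_rng(0)
img = rng.integers(0, 256, (64, 64))
feats = subband_features(img)   # 3 detail subbands x 4 statistics = 12 features
```

A classifier trained on such vectors from clean images can then flag images whose statistics deviate, which is the universal (embedding-agnostic) aspect of the approach.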
Multidimensional infinitely divisible cascades: application to the modelling of intermittency in turbulence
European Physical Journal B, 2005
Abstract

Cited by 21 (4 self)
We propose to model the statistics of natural images using the large class of stochastic processes called Infinitely Divisible Cascades (IDCs). IDCs were first introduced in one dimension to provide multifractal time series to model the so-called intermittency phenomenon in hydrodynamic turbulence. We have extended the definition of scalar IDCs from one to N dimensions and commented on the relevance of such a model to fully developed turbulence in [1]. In this paper, we focus on the particular 2D case. IDCs appear as good candidates to model the statistics of natural images. They share most of their usual properties and appear to be consistent with several independent theoretical and experimental approaches in the literature. We point out the interest of IDCs for applications to procedural texture synthesis.
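A crude discrete analogue of such a cascade is easy to sketch: a dyadic multiplicative cascade with log-normal weights. The grid size and the weight distribution are illustrative choices only; the paper's IDCs are continuous constructions, not this dyadic toy.

```python
import numpy as np

def cascade_2d(n_levels=6, sigma=0.3, seed=0):
    # dyadic multiplicative cascade: refine the grid and multiply by
    # i.i.d. log-normal weights at each scale (mean-corrected so the
    # expected value of the field stays 1)
    rng = np.random.default_rng(seed)
    field = np.ones((1, 1))
    for _ in range(n_levels):
        field = np.kron(field, np.ones((2, 2)))          # double the resolution
        field *= rng.lognormal(-sigma ** 2 / 2, sigma, field.shape)
    return field

tex = cascade_2d()   # a 64x64 positive multifractal-like texture
```

Multiplying random factors across scales is what produces the intermittent, heavy-tailed local statistics that motivate using cascades for natural-image and turbulence modelling.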
Model-based decoding, information estimation, and change-point detection in multi-neuron spike trains
Neural Computation (under review), 2007
Abstract

Cited by 19 (12 self)
Understanding how stimulus information is encoded in spike trains is a central problem in computational neuroscience. Decoding methods provide an important tool for addressing this problem, by allowing us to explicitly read out the information contained in spike responses. Here we introduce several decoding methods based on point-process neural encoding models (i.e. “forward” models that predict spike responses to novel stimuli). These models have concave log-likelihood functions, allowing for efficient fitting via maximum likelihood. Moreover, we may use the likelihood of the observed spike trains under the model to perform optimal decoding. We present: (1) a tractable algorithm for computing the maximum a posteriori (MAP) estimate of the stimulus, the most probable stimulus to have generated the observed single- or multiple-spike train response, given some prior distribution over the stimulus; (2) a Gaussian approximation to the posterior distribution, which allows us to quantify the fidelity with which various stimulus features are encoded; (3) an efficient method for estimating the mutual information between the stimulus and the response; and (4) a framework for the detection of change-point times (e.g. the time at which the stimulus undergoes a change in mean or variance), by marginalizing over the posterior distribution of stimuli. We show several examples illustrating the performance of these estimators with simulated data.
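The MAP estimate of item (1) can be sketched for a toy one-neuron Poisson encoding model. The filter, the Gaussian prior, and the plain gradient-ascent settings below are illustrative assumptions; the paper treats general point-process models and multi-neuron responses. The key property is visible here: the log-posterior is concave in the stimulus, so gradient ascent reaches the global MAP.

```python
import numpy as np

rng = np.random.default_rng(2)
T, w, b, sigma2 = 100, 1.0, 0.0, 1.0          # toy encoding model: rate_t = exp(w*x_t + b)
x_true = rng.normal(0.0, 1.0, T)              # stimulus to be decoded
spikes = rng.poisson(np.exp(w * x_true + b))  # observed spike counts per bin

def log_posterior(x):
    rate = np.exp(w * x + b)
    loglik = np.sum(spikes * (w * x + b) - rate)   # Poisson log-likelihood, concave in x
    logprior = -np.sum(x ** 2) / (2.0 * sigma2)    # Gaussian prior over the stimulus
    return loglik + logprior

def map_decode(n_steps=500, lr=0.02):
    # gradient ascent on the concave log-posterior converges to the MAP stimulus
    x = np.zeros(T)
    for _ in range(n_steps):
        grad = spikes * w - w * np.exp(w * x + b) - x / sigma2
        x += lr * grad
    return x

x_map = map_decode()
```

In the single-filter case each time bin decouples, so the optimum can even be found bin by bin; with temporal filters the bins couple and the banded Hessian structure exploited in the paper becomes important.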
Motion Tuned Spatiotemporal Quality Assessment of Natural Videos
IEEE Transactions on Image Processing, 2010
Abstract

Cited by 19 (3 self)
There has recently been a great deal of interest in the development of algorithms that objectively measure the integrity of video signals. Since video signals are being delivered to human end users in an increasingly wide array of applications and products, it is important that automatic methods of video quality assessment (VQA) be available that can assist in controlling the quality of video being delivered to this critical audience. Naturally, the quality of motion representation in videos plays an important role in the perception of video quality, yet existing VQA algorithms make little direct use of motion information, thus limiting their effectiveness. We seek to ameliorate this by developing a general, spatio-spectrally localized multiscale framework for evaluating dynamic video fidelity that integrates both spatial and temporal (and spatiotemporal) aspects of distortion assessment. Video quality is evaluated not only in space and time, but also in space-time, by evaluating motion quality along computed motion trajectories. Using this framework, we develop a full-reference VQA algorithm for which we coin the term the MOtion-based Video Integrity Evaluation index, or MOVIE index. It is found that the MOVIE index delivers VQA scores that correlate quite closely with human subjective judgment, using the Video Quality Expert Group (VQEG) FRTV Phase 1 database as a test bed. Indeed, the MOVIE index is found to be quite competitive with, and even outperform, algorithms developed and submitted to the VQEG FRTV Phase 1 study, as well as more recent VQA algorithms tested on this database.
On Deep Generative Models with Applications to Recognition
Abstract

Cited by 11 (2 self)
The most popular way to use probabilistic models in vision is first to extract some descriptors of small image patches or object parts using well-engineered features, and then to use statistical learning tools to model the dependencies among these features and eventual labels. Learning probabilistic models directly on the raw pixel values has proved to be much more difficult and is typically only used for regularizing discriminative methods. In this work, we use one of the best pixel-level generative models of natural images, a gated MRF, as the lowest level of a deep belief network (DBN) that has several hidden layers. We show that the resulting DBN is very good at coping with occlusion when predicting expression categories from face images, and it can produce features that perform comparably to SIFT descriptors for discriminating different types of scene. The generative ability of the model also makes it easy to see what information is captured and what is lost at each level of representation.
Rate distortion behavior of sparse sources
Technical Report, EPFL, 2001
Abstract

Cited by 5 (0 self)
The rate distortion behavior of sparse memoryless sources is studied. Such sources serve as models for sparse representations and can be used for the performance analysis of “sparsifying” transforms like the wavelet transform, as well as nonlinear approximation schemes. Under the Hamming distortion criterion, R(D) is shown to be almost linear for sources emitting sparse binary vectors. For continuous random variables, the geometric mean is proposed as a sparsity measure and shown to lead to upper and lower bounds on the entropy, thereby characterizing asymptotic R(D) behavior. Three models are analyzed more closely under the mean squared error distortion measure: continuous spikes in random discrete locations, power laws matching the approximately scale-invariant decay of wavelet coefficients, and Gaussian mixtures. The latter are versatile models for sparse data, which in particular allow one to bound the suitably defined coding gain of a scalar mixture compared to that of a corresponding unmixed transform coding system. Such a comparison is interesting for transforms with known coefficient decay but unknown coefficient ordering, e.g. when the positions of highest-variance coefficients are unknown. The use of these models and results in distributed coding and compressed sensing scenarios is also discussed.
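For the binary case mentioned above, the classical rate-distortion function of a Bernoulli(p) source under Hamming distortion, R(D) = h(p) - h(D), can be evaluated directly. This is the textbook formula, not this paper's derivation, and the particular p and D values below are arbitrary.

```python
import numpy as np

def h2(p):
    # binary entropy in bits
    p = np.clip(p, 1e-12, 1.0 - 1e-12)
    return -p * np.log2(p) - (1.0 - p) * np.log2(1.0 - p)

def rate_distortion_bernoulli(p, D):
    # R(D) = h(p) - h(D) for 0 <= D <= min(p, 1-p), and 0 beyond that
    D = np.asarray(D, dtype=float)
    return np.where(D < min(p, 1.0 - p), h2(p) - h2(D), 0.0)

p = 0.05                                # sparse source: ones are rare
D = np.array([0.001, 0.01, 0.02, 0.04])
R = rate_distortion_bernoulli(p, D)     # bits per symbol, decreasing in D
```

For small p the curve is dominated by the h(p) term over most of the distortion range, which is the near-linearity in D the abstract refers to.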
How to generate realistic images using gated MRF’s
Advances in Neural Information Processing Systems (NIPS)
Abstract

Cited by 3 (0 self)
Probabilistic models of natural images are usually evaluated by measuring performance on rather indirect tasks, such as denoising and inpainting. A better way to evaluate a generative model is to draw samples from it and to check whether statistical properties of the samples match the statistics of natural images. This method is seldom used with high-resolution images, because current models produce samples that are very different from natural images, as assessed by even simple visual inspection. We investigate the reasons for this failure and we show that by augmenting existing models so that there are two sets of latent variables, one set modelling pixel intensities and the other set modelling image-specific pixel covariances, we are able to generate high-resolution images that look much more realistic than before. The overall model can be interpreted as a gated MRF where both pairwise dependencies and mean intensities of pixels are modulated by the states of latent variables. Finally, we confirm that if we disallow weight-sharing between receptive fields that overlap each other, the gated MRF learns more efficient