Results 1–10 of 12
Blind image quality assessment: A natural scene statistics approach in the DCT domain
 IEEE Trans. on Image Processing
Cited by 45 (13 self)
Abstract — We develop an efficient general-purpose blind/no-reference image quality assessment (IQA) algorithm using a natural scene statistics (NSS) model of discrete cosine transform (DCT) coefficients. The algorithm is computationally appealing, given the availability of platforms optimized for DCT computation. The approach relies on a simple Bayesian inference model to predict image quality scores from features extracted under an NSS model of the image DCT coefficients: the estimated parameters of the model are used to form features that are indicative of perceptual quality. The resulting algorithm, which we name BLIINDS-II, requires minimal training and adopts a simple probabilistic model for score prediction. Given the features extracted from a test image, the quality score that maximizes the probability of the empirically determined inference model is chosen as the predicted quality score of that image. When tested on the LIVE IQA database, BLIINDS-II is shown to correlate highly with human judgments of quality, at a level that is competitive with the popular SSIM index. Index Terms — Discrete cosine transform (DCT), generalized Gaussian density, natural scene statistics, no-reference image quality assessment.
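The abstract above sketches a pipeline of block DCT → NSS model fit → features. A minimal illustration of one such feature follows; the moment-matching estimator of the generalized Gaussian shape parameter and the 8×8 block size are assumptions made here for illustration, and BLIINDS-II's actual feature set and Bayesian prediction stage are considerably richer.

```python
import numpy as np
from scipy.fft import dct
from scipy.special import gamma

def ggd_shape(coeffs):
    """Estimate the generalized Gaussian shape parameter of a set of
    coefficients by moment matching: the ratio E[|x|]^2 / E[x^2] is a
    monotone function of the shape, inverted here by table lookup.
    (A common NSS estimator; not necessarily the one used in BLIINDS-II.)"""
    coeffs = np.ravel(coeffs)
    rho = np.mean(np.abs(coeffs)) ** 2 / np.mean(coeffs ** 2)
    g = np.linspace(0.2, 10.0, 5000)
    table = gamma(2.0 / g) ** 2 / (gamma(1.0 / g) * gamma(3.0 / g))
    return g[np.argmin(np.abs(table - rho))]

def block_dct_shape_feature(img, bsize=8):
    """Per-block 2-D DCT of a grayscale image; the DC term is discarded
    and the pooled AC coefficients are summarized by one GGD shape value."""
    h, w = img.shape
    ac = []
    for i in range(0, h - bsize + 1, bsize):
        for j in range(0, w - bsize + 1, bsize):
            block = img[i:i + bsize, j:j + bsize]
            coeffs = dct(dct(block.T, norm='ortho').T, norm='ortho')
            ac.append(coeffs.ravel()[1:])  # drop DC
    return ggd_shape(np.concatenate(ac))
```

As a sanity check, Gaussian samples should yield a shape near 2 and Laplacian samples a shape near 1, the two classical special cases of the GGD.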
On the mathematical properties of the structural similarity index
 IEEE Trans. Image Process
, 2012
Cited by 15 (5 self)
Abstract—Since its introduction in 2004, the structural similarity (SSIM) index has gained widespread popularity as a tool to assess the quality of images and to evaluate the performance of image processing algorithms and systems. There has also been growing interest in using SSIM as an objective function in optimization problems in a variety of image processing applications. One major issue that could strongly impede the progress of such efforts is the lack of understanding of the mathematical properties of the SSIM measure. For example, some highly desirable properties, such as convexity and the triangle inequality, that are possessed by the mean squared error may not hold. In this paper, we first construct a series of normalized and generalized (vector-valued) metrics based on the important ingredients of SSIM. We then show that such modified measures are valid distance metrics and have many useful properties, among which the most significant are quasi-convexity, a region of convexity around the minimizer, and distance preservation under orthogonal or unitary transformations. The groundwork laid here extends the potential of SSIM in both theoretical development and practical applications. Index Terms—Cone metrics, normalized metrics, perceptually optimized algorithms and methods, quality metrics and assessment tools, quasi-convexity and convexity, structural similarity (SSIM) index.
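To make the "normalized metrics built from SSIM ingredients" idea concrete, here is a small numerical sketch. It assumes a global (non-windowed) computation of the two SSIM terms and a root-form combination of them; the constructions in the paper itself are more general, and this is only an illustration of the flavor of such measures.

```python
import numpy as np

def ssim_components(x, y, C1=0.01 ** 2, C2=0.03 ** 2):
    """Luminance term s1 and contrast-structure term s2 of SSIM, computed
    globally over two equal-size signals (the usual SSIM index averages
    these over local windows instead)."""
    mx, my = x.mean(), y.mean()
    vx, vy = x.var(), y.var()
    cxy = ((x - mx) * (y - my)).mean()
    s1 = (2 * mx * my + C1) / (mx ** 2 + my ** 2 + C1)
    s2 = (2 * cxy + C2) / (vx + vy + C2)
    return s1, s2

def root_ssim_distance(x, y):
    """A distance of the form sqrt((1 - s1) + (1 - s2)), i.e. an l2
    combination of root distances built from the SSIM ingredients.
    This is an illustrative instance of the normalized-metric family,
    not the paper's exact definition."""
    s1, s2 = ssim_components(x, y)
    return np.sqrt(max(0.0, 2.0 - s1 - s2))
```

Unlike 1 − SSIM itself, measures of this root form can satisfy the metric axioms (identity, symmetry, triangle inequality) on suitable signal domains, which is what makes them usable in optimization.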
Psychophysically tuned divisive normalization approximately factorizes the PDF of natural images
 Neural Computation
, 2010
Cited by 7 (2 self)
Abstract. The conventional approach in computational neuroscience in favor of the efficient encoding hypothesis goes from image statistics to perception. It has been argued that the behavior of the early stages of biological visual processing (e.g., spatial frequency analyzers and their nonlinearities) may be obtained from image samples and the efficient encoding hypothesis, using no psychophysical or physiological information. In this work we address the same issue in the opposite direction, from perception to image statistics: we show that a psychophysically fitted image representation in V1 has appealing statistical properties, e.g., approximate PDF factorization and substantial mutual information reduction, even though no statistical information is used to fit the V1 model. These results are complementary evidence in favor of the efficient encoding hypothesis.
Is image quality a function of contrast perception?
Cited by 1 (0 self)
In this retrospective we trace in broad strokes the development of image quality measures based on the study of the early stages of the human visual system (HVS), where contrast encoding is fundamental. We find that while presenters at the Human Vision and Electronic Imaging meetings have frequently striven to find points of contact between the study of human contrast psychophysics and the development of computer vision and image quality algorithms, progress has not always been made on these terms, although an indirect impact of vision science on more recent image quality metrics can be observed.
1. CONTRAST PERCEPTION
Human spatial vision is complicated. We know certain facts about it: thresholds vary with spatiotemporal frequency and eccentricity, there is masking and adaptation, etc., and all of these are mediated by bandpass channels that analyze the image along different dimensions. It was thought, and may still be held by many, that by understanding the psychophysical properties of contrast perception, and by modeling these more and more precisely, measures of image quality could be designed. Results of new algorithms for encoding, compression, transmission, or display of electronic images could be presented to a simulated human visual system that would then return a verdict on image quality consistent with human quality judgments.
FMAD: A Feature-Based Extension of the Most Apparent Distortion Algorithm for Image Quality Assessment
Cited by 1 (0 self)
In this paper, we describe the results of a study designed to investigate the effectiveness of peak signal-to-noise ratio (PSNR) as a quality estimator when measured in various feature domains. Although PSNR is well known to be a poor predictor of image quality, it has been shown to be quite effective for additive, pixel-based distortions. We hypothesized that PSNR might also be effective for other types of distortions, which induce changes to other visual features, as long as PSNR is measured between local measures of such features. Given a reference and distorted image, five feature maps are measured for each image (lightness distance, color distance, contrast, edge strength, and sharpness). We describe a variant of PSNR in which quality is estimated based on the extent to which these feature maps for the reference image differ from the corresponding maps for the distorted image. We demonstrate how this feature-based approach can lead to improved estimators of image quality.
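The core mechanism described above, PSNR computed between feature maps rather than between raw pixels, can be sketched as follows. The gradient-magnitude map used here is only a stand-in for one of the five feature maps (edge strength); the map definitions and the pooling across maps are assumptions for illustration, not FMAD's actual choices.

```python
import numpy as np

def psnr(a, b, peak=1.0):
    """Standard PSNR in dB between two arrays sharing the given peak value."""
    mse = np.mean((a - b) ** 2)
    return np.inf if mse == 0 else 10.0 * np.log10(peak ** 2 / mse)

def gradient_magnitude(img):
    """A crude edge-strength map from central differences; a stand-in
    for the paper's edge feature."""
    gy, gx = np.gradient(img)
    return np.hypot(gx, gy)

def feature_psnr(ref, dist):
    """PSNR measured in each feature domain and averaged. Only two maps
    (raw intensity and edge strength) are used here; FMAD uses five,
    and its pooling rule is not specified by this sketch."""
    pairs = [(ref, dist),
             (gradient_magnitude(ref), gradient_magnitude(dist))]
    return float(np.mean([psnr(a, b) for a, b in pairs]))
```

As expected for any PSNR-style measure, larger distortions produce lower scores, so the pooled feature-domain value behaves monotonically in distortion strength.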
Blind Image and Video Quality Assessment Using Natural Scene and Motion Models
, 2013
Geometrical and Statistical Properties of Vision Models obtained via Maximum Differentiation
We examine properties of perceptual image distortion models, computed as the mean squared error in the response of a two-stage cascaded image transformation. Each stage in the cascade is composed of a linear transformation followed by a local nonlinear normalization operation. We consider two such models. For the first, the structure of the linear transformations is chosen according to perceptual criteria: a center-surround filter that extracts local contrast, and a filter designed to select visually relevant contrast according to the Standard Spatial Observer. For the second, the linear transformations are chosen according to a statistical criterion, so as to eliminate correlations estimated from responses to a set of natural images. For both models, the parameters that govern the scale of the linear filters and the properties of the nonlinear normalization operation are chosen to achieve minimal/maximal subjective discriminability of pairs of images that have been optimized to minimize/maximize the model, respectively (we refer to this as MAximum Differentiation, or “MAD”, optimization). We find that both representations substantially reduce redundancy (mutual information), with a larger reduction occurring in the second (statistically optimized) model. We also find that both models are highly correlated with subjective scores from the TID2008 database, with slightly better performance seen in the first (perceptually chosen) model. Finally, we use a foveated version of the perceptual model to synthesize visual metamers. Specifically, we generate an example of a distorted image that is optimized so as to minimize the perceptual error over receptive fields that scale with eccentricity, demonstrating that the errors are barely visible despite a substantial MSE relative to the original image.
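The redundancy comparisons reported above rest on estimating mutual information between response channels. A simple histogram-based estimator of the kind commonly used for such measurements is sketched below; the bin count and the 2-D (pairwise) restriction are assumptions of this sketch, and the paper's actual estimator may differ.

```python
import numpy as np

def mutual_information(x, y, bins=32):
    """Plug-in estimate of mutual information (in bits) between two
    scalar response channels, from a 2-D histogram. Simple and biased
    upward by binning, but adequate for before/after comparisons."""
    pxy, _, _ = np.histogram2d(x, y, bins=bins)
    pxy = pxy / pxy.sum()
    px = pxy.sum(axis=1, keepdims=True)   # marginal of x
    py = pxy.sum(axis=0, keepdims=True)   # marginal of y
    nz = pxy > 0                          # avoid log(0)
    return float(np.sum(pxy[nz] * np.log2(pxy[nz] / (px @ py)[nz])))
```

For independent channels the estimate stays near zero (up to binning bias), while statistically dependent channels yield clearly positive values, which is the contrast such redundancy analyses rely on.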
Chromatic Induction and Contrast Masking: similar models, different goals?
Normalization of signals coming from linear sensors is a ubiquitous mechanism of neural adaptation.1 Local interaction between sensors tuned to a particular feature at a certain spatial position and neighboring sensors explains a wide range of psychophysical facts, including (1) masking of spatial patterns,2 (2) nonlinearities of motion sensors,3 (3) adaptation of color perception,4 (4) brightness and chromatic induction,5,6 and (5) image quality assessment.7 Although the above models have formal and qualitative similarities, this does not necessarily mean that the mechanisms involved are pursuing the same statistical goal. For instance, in the case of chromatic mechanisms (disregarding spatial information), different parameters in the normalization give rise to optimal discrimination or adaptation,8 and different nonlinearities may give rise to error minimization or component independence.9 In the case of spatial sensors (disregarding color information), a number of studies have pointed out the benefits of masking in statistical independence terms.10–13 However, such statistical analysis has not been performed for spatio-chromatic induction models, in which chromatic perception depends on spatial configuration. In this work we investigate whether successful spatio-chromatic induction models6 increase component independence in the same way as previously reported for masking models.12 Mutual information analysis suggests that seeking an efficient chromatic representation may explain the prevalence of induction effects in spatially simple images.
Visual discrimination and adaptation using nonlinear unsupervised learning
Understanding human vision involves not only empirical descriptions of how it works, but also organization principles that explain why it does so.1 Identifying the guiding principles of visual phenomena requires learning algorithms to optimize specific goals. Moreover, these algorithms have to be flexible enough to account for the nonlinear and adaptive behavior of the system. For instance, linear redundancy reduction transforms certainly explain a wide range of visual phenomena.2–9 However, the generality of this organization principle is still in question:10 it is not only that additional constraints such as energy cost may be relevant as well,11 but also that statistical independence may not be the best solution for making optimal inferences in squared-error terms.12–14 Moreover, linear methods cannot account for the non-uniform discrimination in different regions of the image and color space: linear learning methods necessarily disregard the nonlinear nature of the system. To account for the nonlinear behavior, principled approaches commonly apply the trick of using (already nonlinear) parametric expressions taken from empirical models.15–17 These approaches are therefore not actually explaining the nonlinear behavior, but merely fitting it to image statistics. In summary, a proper explanation of the behavior of the system requires flexible unsupervised learning algorithms that (1) are tunable to different, perceptually meaningful, goals; and (2) make
Dependency Reduction with Divisive Normalization: Justification and Effectiveness
 Letter communicated by Odelia Schwartz
Efficient coding transforms that reduce or remove statistical dependencies in natural sensory signals are important for both biology and engineering. In recent years, divisive normalization (DN) has been advocated as a simple and effective nonlinear efficient coding transform. In this work, we first elaborate on the theoretical justification for DN as an efficient coding transform. Specifically, we use the multivariate t model to represent several important statistical properties of natural sensory signals and show that DN approximates the optimal transforms that eliminate statistical dependencies in the multivariate t model. Second, we show that several forms of DN used in the literature are equivalent in their effects as efficient coding transforms. Third, we provide a quantitative evaluation of the overall dependency-reduction performance of DN for both multivariate t models and natural sensory signals. Finally, we find that statistical dependencies in the multivariate t model and natural sensory signals are increased by the DN transform when the input dimension is low. This implies that for DN to be an effective efficient coding transform, it has to pool over a sufficiently large number of inputs.
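The dependency-reduction effect that this abstract (and several others above) attributes to divisive normalization can be demonstrated on a toy signal: a shared multiplier imposed on independent Gaussian responses creates the energy dependencies typical of natural signals, and DN largely removes them. The uniform pooling weights, the semisaturation constant, and the synthetic multiplier model are all assumptions of this sketch, not choices made in the paper.

```python
import numpy as np

def divisive_normalization(r, sigma=0.1, p=2.0):
    """Canonical divisive normalization
        y_i = r_i / (sigma^p + mean_j |r_j|^p)^(1/p),
    with uniform pooling weights over all inputs along the last axis.
    sigma is the semisaturation constant; both are illustrative values."""
    pool = (sigma ** p
            + np.mean(np.abs(r) ** p, axis=-1, keepdims=True)) ** (1.0 / p)
    return r / pool

rng = np.random.default_rng(0)
n, k = 40000, 16                      # many samples, pool over 16 channels
mixer = rng.uniform(0.2, 4.0, size=(n, 1))   # shared local "contrast"
x = mixer * rng.normal(size=(n, k))   # independent responses, common scale
y = divisive_normalization(x)

# Energy correlation between two channels before and after DN.
c_before = np.corrcoef(x[:, 0] ** 2, x[:, 1] ** 2)[0, 1]
c_after = np.corrcoef(y[:, 0] ** 2, y[:, 1] ** 2)[0, 1]
```

The squared responses are noticeably correlated before DN (the shared multiplier couples their magnitudes) and nearly decorrelated after, consistent with the paper's point that DN must pool over sufficiently many inputs to be effective.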