Results 1 - 6 of 6
Issues in Vision Modeling for Perceptual Video Quality Assessment, 1999
Cited by 47 (10 self)
Lossy compression algorithms used in digital video systems produce artifacts whose visibility strongly depends on the actual image content. Simple error measures such as RMSE or PSNR, albeit popular, ignore this important fact and are only a mediocre predictor of perceived quality. Many applications require more reliable assessment methods. This paper discusses issues in vision modeling for perceptual video quality assessment (PVQA). Its purpose is not to describe a particular model or system, but rather to summarize and to provide pointers to up-to-date knowledge of important characteristics of the human visual system, to explain how these characteristics may be incorporated in vision models for PVQA, to give a brief overview of the state-of-the-art and current efforts in this field, and to outline directions for future research.
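The abstract's point that RMSE and PSNR ignore image content can be illustrated with a minimal sketch. The code below is not taken from the paper; the function names, the flat 16-pixel test image, and the 8-bit peak value of 255 are assumptions chosen for illustration. It shows two distortions, one spread evenly over every pixel and one concentrated in a single pixel, that receive identical scores even though a viewer would likely perceive them differently.

```python
import math


def rmse(reference, distorted):
    """Root-mean-square error between two equally sized pixel sequences."""
    if len(reference) != len(distorted) or not reference:
        raise ValueError("inputs must be non-empty and of equal length")
    squared_sum = sum((r - d) ** 2 for r, d in zip(reference, distorted))
    return math.sqrt(squared_sum / len(reference))


def psnr(reference, distorted, peak=255.0):
    """Peak signal-to-noise ratio in dB; `peak` assumes 8-bit pixel values."""
    err = rmse(reference, distorted)
    if err == 0:
        return math.inf  # identical signals
    return 20.0 * math.log10(peak / err)


# Hypothetical flat grey image with 16 pixels.
reference = [100] * 16

# Distortion A: a small error on every pixel.
spread = [p + 4 for p in reference]

# Distortion B: one large error on a single pixel, same total squared error.
concentrated = list(reference)
concentrated[0] += 16

# Both distortions get the same RMSE and therefore the same PSNR,
# regardless of where in the image the error is located.
print(rmse(reference, spread), rmse(reference, concentrated))
print(psnr(reference, spread), psnr(reference, concentrated))
```

Because both measures depend only on the total squared error, they cannot tell where the error sits or what content surrounds it, which is the limitation that motivates vision-model-based metrics.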
Acuity-matching resolution degradation through wavelet coefficient scaling
- IEEE Transactions on Image Processing 9, 2000
Cited by 10 (0 self)
A wavelet-based multiresolution image representation method is developed matching Human Visual System (HVS) spatial acuity within multiple Regions Of Interest (ROIs). ROIs are maintained at high (original) resolution while peripheral areas are gracefully degraded. Variable resolution images are generated by selectively scaling wavelet (detail) coefficients prior to reconstruction. The technique is equivalent to linear interpolation MIP-mapping which involves smooth subsampling (decomposition) prior to texture mapping (reconstruction). Multiple ROI degradation is achieved through wavelet coefficient scaling following Voronoi partitioning of the image plane.
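The idea of scaling detail coefficients before reconstruction can be sketched in one dimension. This is a simplified illustration and not the paper's method: the one-level Haar transform, the linear falloff, and all names (`haar_forward`, `detail_weight`, `degrade`, `falloff`) are assumptions. Choosing the nearest ROI centre for each coefficient plays the role of the Voronoi partitioning mentioned in the abstract.

```python
def haar_forward(signal):
    """One-level Haar transform: pairwise averages and pairwise differences."""
    if len(signal) % 2 != 0:
        raise ValueError("signal length must be even")
    pairs = range(0, len(signal), 2)
    approx = [(signal[i] + signal[i + 1]) / 2.0 for i in pairs]
    detail = [(signal[i] - signal[i + 1]) / 2.0 for i in pairs]
    return approx, detail


def haar_inverse(approx, detail):
    """Reconstruct the signal from averages and (possibly scaled) details."""
    out = []
    for a, d in zip(approx, detail):
        out.extend([a + d, a - d])
    return out


def detail_weight(position, roi_centers, falloff):
    """Weight in [0, 1]: 1 at the nearest ROI centre, falling linearly to 0.

    Using the nearest centre assigns each position to one ROI, which is a
    Voronoi partition of the axis (assumed stand-in for the 2D case).
    """
    nearest = min(abs(position - c) for c in roi_centers)
    return max(0.0, 1.0 - nearest / falloff)


def degrade(signal, roi_centers, falloff):
    """Attenuate detail coefficients with distance from the nearest ROI."""
    approx, detail = haar_forward(signal)
    # Coefficient k covers samples 2k and 2k+1, so its centre is 2k + 0.5.
    scaled = [d * detail_weight(2 * k + 0.5, roi_centers, falloff)
              for k, d in enumerate(detail)]
    return haar_inverse(approx, scaled)


signal = [10, 2, 8, 4, 6, 0, 9, 1]
print(degrade(signal, roi_centers=[0.5], falloff=4.0))
# -> [10.0, 2.0, 7.0, 5.0, 3.0, 3.0, 5.0, 5.0]
```

Samples at the ROI keep their original values, while samples farther away collapse toward their local average, which is the graceful peripheral degradation the abstract describes. A 2D version would apply the same weighting to each detail subband at every decomposition level.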
Segmentation-driven perceptual quality metrics
- in Proc. ICIP, 2004
Cited by 6 (0 self)
We present a full-reference and a no-reference perceptual video quality metric that incorporate both low-level and high-level aspects of vision. Low-level aspects include color perception, contrast sensitivity, masking as well as artifact analysis. High-level aspects take into account the cognitive behavior of an observer when watching a video by means of semantic segmentation. Using the special case of semantic face segmentation, we evaluate the proposed segmentation-driven perceptual quality metrics using a range of test sequences and demonstrate an improvement of their prediction performance.
A Hybrid Scheme for Perceptual Object Window Design with Joint Scene Analysis and Eye-Gaze Tracking for Media Encoding based on Perceptual Attention
- In Journal of Electronic Imaging 15(02), 2006
Cited by 3 (1 self)
The possibility of perceptual compression using live eye-tracking has been anticipated for some time by many researchers. Among the challenges of real-time eye-gaze based perceptual video compression is how to reconcile the speed of eye movements with the relative complexity of video transcoding, while also taking into account the delay associated with transmission in the network. Such delay requires additional consideration in perceptual encoding because it increases the size of the area that requires high quality coding. In this paper we present a hybrid scheme, one of the first to our knowledge, which combines eye-tracking with fast in-line scene analysis to drastically narrow down the high acuity area without the loss of eye-gaze containment. Keywords: eye-gaze, perceptual encoding, MPEG-2.
Perceptual Attention Focus Prediction for Multiple Viewers in Case of Multimedia Perceptual Compression with Feedback Delay
- In Proceedings of the symposium on Eye tracking research & applications 2006 (ETRA 06), 2006
Cited by 2 (1 self)
Human eyes have limited perception capabilities. Only 2 degrees of our 180 degree vision field provide the highest quality of perception. This fact gave rise to the idea of perceptual attention focus: visual content is changed so that only the part of the visual field to which a viewer's attention is directed is encoded with high quality. The image quality in the periphery can be reduced without a viewer noticing it. This compression approach allows a significant decrease in bit-rate for a video stream and, in the case of 3D stream rendering, decreases the computational burden. A number of previous researchers have investigated real-time perceptual attention focus, but only for a single viewer. In this paper we investigate a dynamically changing multi-viewer scenario, in which a number of people watch the same visual content at the same time, each using eye-tracking equipment. The visual content (video, 3D stream) is sent through a network with a large transmission delay. The area of perceptual attention focus is predicted for the viewers to compensate for the delay and to identify the area of the image that requires the highest quality coding.
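To give a sense of how small the 2 degree high-acuity region is on a display, the following sketch converts a visual angle to an on-screen size using basic trigonometry. It is not part of the paper: the function names, the 600 mm viewing distance, and the 4 pixels-per-millimetre display density are hypothetical values chosen for illustration, and the formula assumes a flat screen viewed head-on.

```python
import math


def visual_angle_to_length(angle_deg, viewing_distance):
    """Length on a flat screen subtended by a visual angle centred on the
    line of sight. The result is in the same unit as `viewing_distance`."""
    if not 0 < angle_deg < 180:
        raise ValueError("angle must be between 0 and 180 degrees")
    return 2.0 * viewing_distance * math.tan(math.radians(angle_deg) / 2.0)


def length_to_pixels(length_mm, pixels_per_mm):
    """Convert a physical length on the screen to a pixel count."""
    return length_mm * pixels_per_mm


# Hypothetical setup: viewer 600 mm from a display with 4 pixels per mm.
foveal_mm = visual_angle_to_length(2.0, 600.0)
foveal_px = length_to_pixels(foveal_mm, 4.0)

print(f"2 degree region: {foveal_mm:.1f} mm wide, about {foveal_px:.0f} px")
```

Under these assumed values the region is roughly 21 mm across, which suggests why encoding only the attended area at high quality can save bit-rate, and why prediction errors caused by transmission delay force that area to be enlarged.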
Where people look when watching movies: Do all viewers look at the same place?
- Computers in Biology and Medicine (article in press)
www.intl.elsevierhealth.com/journals/cobm

