Evaluation of the visual performance of image processing pipes: information value of subjective image attribute
In: Proceedings of the SPIE, 2010
Abstract: Subjective image quality data for 9 image processing pipes and 8 image contents (taken with a mobile phone camera; 72 natural scene test images altogether) were collected from 14 test subjects. A triplet comparison setup and a hybrid qualitative/quantitative methodology were applied. MOS data and the spontaneous, subjective image quality attributes given to each test image were recorded. The use of positive and negative image quality attributes by the experimental subjects suggested a significant difference between the subjective spaces of low and high image quality. The robustness of the attribute data was shown by correlating the DMOS data of the test images against their corresponding average subjective attribute vector length data. The findings demonstrate the information value of spontaneous, subjective image quality attributes in evaluating image quality at variable quality levels. We discuss the implications of these findings for the development of sensitive performance measures and methods for profiling image processing systems and their components, especially at high image quality levels.
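The robustness check described in this abstract, correlating per-image DMOS scores against the average length of each image's subjective attribute vector, can be sketched as follows. This is an illustrative reconstruction, not the authors' code; the toy data and function names are invented for the example.

```python
# Hypothetical sketch: correlate DMOS scores with the Euclidean length of
# each image's attribute-count vector. All data below are placeholders.
import numpy as np

def attribute_vector_length(attr_counts):
    """Euclidean length of an image's attribute-count vector."""
    return float(np.linalg.norm(attr_counts))

def dmos_attribute_correlation(dmos, attr_matrix):
    """Pearson correlation between DMOS and attribute vector lengths.

    dmos        : (n_images,) array of DMOS scores
    attr_matrix : (n_images, n_attributes) counts of attribute mentions
    """
    lengths = np.array([attribute_vector_length(row) for row in attr_matrix])
    return float(np.corrcoef(dmos, lengths)[0, 1])

# Toy example: 4 images, 3 attributes (e.g. "sharp", "noisy", "dark").
dmos = np.array([10.0, 25.0, 40.0, 55.0])
attrs = np.array([[1, 0, 0],
                  [2, 1, 0],
                  [2, 2, 1],
                  [3, 3, 2]])
r = dmos_attribute_correlation(dmos, attrs)
```

A high correlation here would mirror the paper's point that images judged worse also attract more (and stronger) subjective attributes.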
Multi-Level Visual Alphabets
Abstract — A central debate in visual perception theory is the argument for indirect versus direct perception; i.e., the use of intermediate, abstract, and hierarchical representations versus direct semantic interpretation of images through interaction with the outside world. We present a content-based representation that combines both approaches. The previously developed Visual Alphabet method is extended with a hierarchy of representations, each level feeding into the next one, but based on features that are not abstract but directly relevant to the task at hand. Explorative benchmark experiments are carried out on face images to investigate and explain the impact of the key parameters such as pattern size, number of prototypes, and distance measures used. Results show that adding an additional middle layer improves results, by encoding the spatial co-occurrence of lower-level pattern prototypes.
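The middle layer described above, which encodes the spatial co-occurrence of lower-level pattern prototypes, can be illustrated with a small sketch. This is not the authors' implementation; prototype assignment is reduced here to nearest-centroid quantization, and the grid and prototypes are invented toy data.

```python
# Illustrative sketch of a middle layer that counts how often pairs of
# lower-level prototype labels occur side by side on the patch grid.
import numpy as np

def assign_prototypes(patches, prototypes):
    """Label each patch with the index of its nearest prototype (L2 distance)."""
    dists = np.linalg.norm(patches[:, None, :] - prototypes[None, :, :], axis=2)
    return dists.argmin(axis=1)

def cooccurrence_histogram(label_grid, n_protos):
    """Count horizontally adjacent prototype-label pairs on a 2-D grid."""
    hist = np.zeros((n_protos, n_protos), dtype=int)
    for row in label_grid:
        for a, b in zip(row[:-1], row[1:]):
            hist[a, b] += 1
    return hist

# Toy example: a 2x2 grid of 2-D patch features, two prototypes.
protos = np.array([[0.0, 0.0], [1.0, 1.0]])
patches = np.array([[0.1, 0.0], [0.9, 1.1], [0.0, 0.2], [1.0, 0.8]])
labels = assign_prototypes(patches, protos).reshape(2, 2)
hist = cooccurrence_histogram(labels, 2)
```

The histogram itself then serves as the higher-level feature vector, which is the sense in which the middle layer "encodes spatial co-occurrence" of the lower-level prototypes.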
Perception of differences in naturalistic dynamic scenes, and a V1-based model
We investigate whether a computational model of V1 can predict how observers rate perceptual differences between paired movie clips of natural scenes. Observers viewed 198 pairs of movie clips, rating how different the two clips appeared to them on a magnitude scale. Sixty-six of the movie pairs were naturalistic and those remaining were low-pass or high-pass spatially filtered versions of those originals. We examined three ways of comparing a movie pair. The Spatial Model compared corresponding frames between each movie pairwise, combining those differences using Minkowski summation. The Temporal Model compared successive frames within each movie, summed those differences for each movie, and then compared the overall differences between the paired movies. The Ordered-Temporal Model combined elements from both models, and yielded the single strongest predictions of observers' ratings. We modeled naturalistic sustained and transient impulse functions and compared frames directly with no temporal filtering. Overall, modeling naturalistic temporal filtering improved the models' performance; in particular, the predictions of the ratings for low-pass spatially filtered movies were much improved by employing a transient impulse function. The correlations between model predictions and observers' ratings rose from 0.507 without temporal filtering to 0.759 (p = 0.01%) when realistic impulses were included. The sustained impulse function and the Spatial Model carried more weight in ratings for normal and high-pass movies, whereas the transient impulse function with the Ordered-Temporal Model was most important for spatially low-pass movies. This is consistent with models in which high spatial frequency channels with sustained responses primarily code for spatial details in movies, while low spatial frequency channels with transient responses code for dynamic events.
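The Spatial Model described above can be sketched in a few lines: compare corresponding frames of the two clips, then combine the per-frame differences by Minkowski summation. This is a minimal illustration under assumed parameters, not the paper's model; the per-frame difference measure here (mean absolute pixel difference) is a stand-in for the V1-based comparison, and the exponent 2.84 follows the related To et al. (2008) result.

```python
# Minimal sketch of the Spatial Model: frame-by-frame differences between
# two clips, combined by Minkowski summation with exponent m.
import numpy as np

def frame_difference(frame_a, frame_b):
    """Stand-in per-frame difference score: mean absolute pixel difference."""
    return float(np.mean(np.abs(frame_a - frame_b)))

def spatial_model(clip_a, clip_b, m=2.84):
    """Minkowski-sum the differences between corresponding frames.

    clip_a, clip_b : (n_frames, height, width) arrays
    m              : Minkowski exponent (assumed value, after To et al. 2008)
    """
    diffs = [frame_difference(a, b) for a, b in zip(clip_a, clip_b)]
    return float(np.sum(np.power(diffs, m)) ** (1.0 / m))

# Toy clips: 3 frames of 2x2 'pixels'; clip_b is clip_a shifted in intensity.
clip_a = np.zeros((3, 2, 2))
clip_b = np.full((3, 2, 2), 0.5)
score = spatial_model(clip_a, clip_b)
```

The Temporal Model differs only in what gets compared: successive frames within each clip rather than corresponding frames across clips.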
Music and natural image processing share a common feature-integration rule
The world is rich in sensory information, and the challenge for any neural sensory system is to piece together the diverse messages from large arrays of feature detectors. In vision and auditory research, there has been speculation about the rules governing combination of signals from different neural channels: e.g. linear (city-block) addition, Euclidean (energy) summation, or a maximum rule. These are all special cases of a more general Minkowski summation rule, (Cue1^m + Cue2^m)^(1/m), where m = 1, 2, and infinity respectively. Recently, we reported that Minkowski summation with exponent m = 2.84 accurately models combination of visual cues in photographs [To et al. (2008). Proc Roy Soc B, 275, 2299]. Here, we ask whether this rule is equally applicable to cue combinations across different auditory dimensions: such as intensity, pitch, timbre and content. We found that in suprathreshold discrimination tasks using musical sequences, a Minkowski summation with exponent close to 3 (m = 2.95) outperformed city-block, Euclidean or maximum combination rules in describing cue integration across feature dimensions. That the same exponent is found in this music experiment and our previous vision experiments suggests the possibility of a universal "Minkowski summation Law" in sensory feature integration. We postulate that this particular Minkowski exponent relates to the degree of correlation in activity between different sensory neurons when stimulated by natural stimuli, and could reflect an overall economical and efficient encoding mechanism underlying perceptual integration of features in the natural world.
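The general Minkowski summation rule quoted above, (Cue1^m + Cue2^m)^(1/m), and its three special cases can be made concrete with a short sketch. The cue magnitudes here are invented toy values; only the rule itself comes from the abstract.

```python
# Sketch of Minkowski summation over cue magnitudes, with its special cases:
# m=1 city-block addition, m=2 Euclidean summation, m->infinity maximum rule.
import math

def minkowski_sum(cues, m):
    """Combine non-negative cue magnitudes with Minkowski exponent m."""
    if math.isinf(m):
        return max(cues)  # the m -> infinity limit is the maximum rule
    return sum(c ** m for c in cues) ** (1.0 / m)

cues = [3.0, 4.0]
city_block = minkowski_sum(cues, 1)         # 7.0
euclidean  = minkowski_sum(cues, 2)         # 5.0
maximum    = minkowski_sum(cues, math.inf)  # 4.0
fitted     = minkowski_sum(cues, 2.95)      # exponent reported above
```

With the fitted exponent near 3, the combined value lies between the Euclidean and maximum rules, which is the sense in which m = 2.95 sits "close to" a max-like combination.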
"... Processing bimodal stimuli: integrality/separability of color ..."
, 2013
Pattern Integration in the Normal and Abnormal Human Visual System
, 2013
"... Some pages of this thesis may have been removed for copyright restrictions. If you have discovered material in AURA which is unlawful e.g. breaches copyright, (either yours or that of a third party) or any other law, including but not limited to those relating to ..."