Results 1 - 5 of 5
Boosting bottom-up and top-down visual features for saliency estimation
- in CVPR, 2012
"... Abstract Despite significant recent progress, the best available visual saliency models still lag behind human performance in predicting eye fixations in free-viewing of natural scenes. Majority of models are based on low-level visual features and the importance of top-down factors has not yet been ..."
Abstract
-
Cited by 21 (3 self)
- Add to MetaCart
(Show Context)
Despite significant recent progress, the best available visual saliency models still lag behind human performance in predicting eye fixations during free viewing of natural scenes. The majority of models are based on low-level visual features, and the importance of top-down factors has not yet been fully explored or modeled. Here, we combine low-level features such as orientation, color, and intensity, together with the saliency maps of the best previous bottom-up models, with top-down cognitive visual features (e.g., faces, humans, and cars) and learn a direct mapping from those features to eye fixations using Regression, SVM, and AdaBoost classifiers. Through extensive experiments over three benchmark eye-tracking datasets using three popular evaluation scores, we show that our boosting model outperforms 27 state-of-the-art models and is so far the closest model to human accuracy in fixation prediction. Furthermore, our model successfully detects the most salient object in a scene without sophisticated image processing such as region segmentation.
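The abstract above names AdaBoost as one way to map stacked feature values to fixation labels. As a minimal illustrative sketch (not the authors' code; the feature channels and synthetic labels here are assumptions), boosting decision stumps over per-pixel feature values looks roughly like this:

```python
# Toy AdaBoost over per-pixel feature vectors (illustrative only): each "pixel"
# carries a few feature-map values, and the label is +1 if it attracted a
# fixation. The three feature channels and the label rule are synthetic.
import math
import random

random.seed(0)

X = [[random.random() for _ in range(3)] for _ in range(400)]   # 3 feature maps
y = [1 if x[0] + x[2] > 1.0 else -1 for x in X]                 # synthetic labels

def stump_predict(x, feat, thresh, sign):
    """Decision stump: predict `sign` when feature `feat` exceeds `thresh`."""
    return sign if x[feat] > thresh else -sign

def train_adaboost(X, y, rounds=20):
    n = len(X)
    w = [1.0 / n] * n
    ensemble = []
    for _ in range(rounds):
        # Exhaustively pick the stump with the lowest weighted error.
        best = None
        for feat in range(len(X[0])):
            for thresh in [i / 10 for i in range(1, 10)]:
                for sign in (1, -1):
                    err = sum(wi for xi, yi, wi in zip(X, y, w)
                              if stump_predict(xi, feat, thresh, sign) != yi)
                    if best is None or err < best[0]:
                        best = (err, feat, thresh, sign)
        err, feat, thresh, sign = best
        err = max(err, 1e-10)
        alpha = 0.5 * math.log((1 - err) / err)   # stump weight
        ensemble.append((alpha, feat, thresh, sign))
        # Re-weight: misclassified pixels get more weight next round.
        w = [wi * math.exp(-alpha * yi * stump_predict(xi, feat, thresh, sign))
             for xi, yi, wi in zip(X, y, w)]
        z = sum(w)
        w = [wi / z for wi in w]
    return ensemble

def saliency_score(ensemble, x):
    """Signed ensemble score; larger means more likely fixated."""
    return sum(a * stump_predict(x, f, t, s) for a, f, t, s in ensemble)

model = train_adaboost(X, y)
accuracy = sum((saliency_score(model, xi) > 0) == (yi > 0)
               for xi, yi in zip(X, y)) / len(X)
```

At test time the signed score at each pixel serves as the saliency value; the real model would train on actual feature maps and recorded fixations rather than synthetic data.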
Learning high-level concepts by training a deep network on eye fixations
- in: Deep Learning and Unsupervised Feature Learning Workshop, in conjunction with NIPS, Lake Tahoe
, 2012
"... Visual attention is the ability to select visual stimuli that are most behaviorally relevant among the many others. It allows us to allocate our limited processing resources to the most informative part of the visual scene. In this paper, we learn general high-level concepts with the aid of selectiv ..."
Abstract
-
Cited by 2 (0 self)
- Add to MetaCart
(Show Context)
Visual attention is the ability to select the visual stimuli that are most behaviorally relevant among many others. It allows us to allocate our limited processing resources to the most informative part of the visual scene. In this paper, we learn general high-level concepts with the aid of selective attention in a principled unsupervised framework: a three-layer deep network is built, and greedy layer-wise training is applied to learn mid- and high-level features from salient regions of images. The network is demonstrated to successfully learn meaningful high-level concepts such as faces and text in the third layer, and mid-level features like junctions, textures, and parallelism in the second layer. Unlike the pre-trained object detectors recently included in saliency models to predict semantic objects, the higher-level features we learn are general base features that are not restricted to one or a few object categories. A saliency model built upon the learned features demonstrates competitive predictive power in natural scenes compared with existing methods.
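The greedy layer-wise scheme this abstract describes can be sketched as follows. This is an illustrative toy, with tied-weight linear autoencoders and random data standing in for patches from salient regions; the actual layer types, sizes, and training objective of the paper's network are not reproduced here.

```python
# Greedy layer-wise training sketch (assumed setup, not the paper's network):
# each layer is a small tied-weight linear autoencoder trained to reconstruct
# the previous layer's output; its hidden code then feeds the next layer.
import numpy as np

rng = np.random.default_rng(0)

def train_autoencoder(X, hidden, epochs=500, lr=0.01):
    """One tied-weight linear autoencoder: code = X @ W, recon = code @ W.T."""
    n, d = X.shape
    W = rng.normal(scale=0.1, size=(d, hidden))
    for _ in range(epochs):
        code = X @ W
        err = code @ W.T - X                        # reconstruction error
        grad = (X.T @ err @ W + err.T @ X @ W) / n  # d||err||^2/dW (up to a factor of 2)
        W -= lr * grad
    return W

X = rng.random((100, 8))          # stand-in for patches sampled at salient regions
layers, H = [], X
for hidden in (6, 4, 2):          # three layers, trained greedily one at a time
    W = train_autoencoder(H, hidden)
    layers.append(W)
    H = H @ W                     # this layer's code becomes the next layer's input
```

The key property of greedy layer-wise training is visible in the loop: each layer is optimized in isolation against the previous layer's fixed output, with no end-to-end backpropagation through the whole stack.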
Advances in Learning Visual Saliency: From Image Primitives to Semantic Contents
, 2014
"... Humans and other primates shift their gaze to allocate processing resources to a subset of the visual input. Understanding and emulating the way that human observers free-view a natural scene has both scientific and economic ..."
Abstract
- Add to MetaCart
"... Humans and other primates shift their gaze to allocate processing resources to a subset of the visual input. Understanding and emulating the way that human observers free-view a natural scene has both scientific and economic ..."
unknown title
"... Abstract. Webpage is becoming a more and more important visual in-put to us. While there are few studies on saliency in webpage, we in this work make a focused study on how humans deploy their attention when viewing webpages and for the first time propose a computational model that is designed to pr ..."
Abstract
- Add to MetaCart
(Show Context)
Webpages are becoming an increasingly important visual input to us. While there are few studies on webpage saliency, in this work we make a focused study of how humans deploy their attention when viewing webpages and, for the first time, propose a computational model designed to predict webpage saliency. A dataset is built with 149 webpages and eye-tracking data from 11 subjects who free-viewed the webpages. Inspired by the viewing patterns on webpages, multi-scale feature maps that contain object-blob and text representations are integrated with explicit face maps and a positional bias. We propose using multiple kernel learning (MKL) to achieve a robust integration of the various feature maps. Experimental results show that the proposed model outperforms its counterparts in predicting webpage saliency.
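The MKL step in this abstract combines several kernels, one per feature map. A minimal sketch of one simple weighting scheme, kernel-target alignment, is shown below; it is a stand-in for the paper's actual MKL formulation, and all data and feature names here are synthetic.

```python
# Kernel-combination sketch (illustrative, not the paper's MKL solver): build
# one linear kernel per feature map, weight each by how well it agrees with
# the label kernel, and sum the weighted kernels.
import numpy as np

rng = np.random.default_rng(1)
n = 60
# Per-region feature maps; the names (text, blob, face) follow the abstract,
# but the values are random stand-ins.
feats = {name: rng.random((n, 1)) for name in ("text", "blob", "face")}
y = np.sign(feats["text"][:, 0] + feats["face"][:, 0] - 1.0)  # synthetic fixation labels

K_target = np.outer(y, y)                  # ideal kernel induced by the labels
weights = {}
for name, F in feats.items():
    K = F @ F.T                            # linear kernel for this feature map
    # Alignment heuristic: weight each kernel by its (uncentered) agreement
    # with the label kernel, normalized by the kernel's Frobenius norm.
    weights[name] = max(float(np.sum(K * K_target)), 0.0) / float(np.linalg.norm(K))
total = sum(weights.values())
weights = {k: v / total for k, v in weights.items()}   # convex combination
K_combined = sum(w * (feats[k] @ feats[k].T) for k, w in weights.items())
```

The combined kernel `K_combined` would then feed a standard kernel predictor; a full MKL method learns the weights jointly with the predictor rather than by this one-shot alignment heuristic.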
Semantic saliency Gaze prediction
, 2014
"... to predict eye fixations for semantic contents using ..."