Results 1 - 10 of 19
EpicFlow: Edge-Preserving Interpolation of Correspondences for Optical Flow
2015
Cited by 8 (0 self)
HAL is a multi-disciplinary open access archive for the deposit and dissemination of scientific research documents, whether they are published or not. The documents may come from teaching and research institutions in France or abroad, or from public or private research centers.
Region-based Convolutional Networks for Accurate Object Detection and Segmentation
Cited by 2 (0 self)
Object detection performance, as measured on the canonical PASCAL VOC Challenge datasets, plateaued in the final years of the competition. The best-performing methods were complex ensemble systems that typically combined multiple low-level image features with high-level context. In this paper, we propose a simple and scalable detection algorithm that improves mean average precision (mAP) by more than 50% relative to the previous best result on VOC 2012, achieving a mAP of 62.4%. Our approach combines two ideas: (1) one can apply high-capacity convolutional networks (CNNs) to bottom-up region proposals in order to localize and segment objects and (2) when labeled training data are scarce, supervised pre-training for an auxiliary task, followed by domain-specific fine-tuning, boosts performance significantly. Since we combine region proposals with CNNs, we call the resulting model an R-CNN or Region-based Convolutional Network. Source code for the complete system is available at
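The core R-CNN idea in this abstract, scoring each bottom-up region proposal independently with a CNN classifier, can be sketched as follows. This is a minimal illustration, not the paper's system: `classify_crop` is a toy stand-in for the fine-tuned network, and the function names and threshold are invented for this sketch.

```python
import numpy as np

def classify_crop(crop):
    """Placeholder for a CNN classifier returning per-class scores.
    In R-CNN this would be a fine-tuned convolutional network."""
    # Toy scorer: mean intensity per channel stands in for class scores.
    return crop.reshape(-1, crop.shape[-1]).mean(axis=0)

def rcnn_style_detect(image, proposals, score_thresh=0.5):
    """Score each bottom-up region proposal independently.

    image:     H x W x C array
    proposals: list of (x0, y0, x1, y1) boxes, e.g. from selective search
    Returns boxes whose top class score exceeds the threshold.
    """
    detections = []
    for (x0, y0, x1, y1) in proposals:
        crop = image[y0:y1, x0:x1]       # crop the proposal region
        scores = classify_crop(crop)     # CNN forward pass (stubbed here)
        cls = int(np.argmax(scores))
        if scores[cls] > score_thresh:
            detections.append(((x0, y0, x1, y1), cls, float(scores[cls])))
    return detections
```

A real pipeline would warp each crop to the network's fixed input size and apply non-maximum suppression across the scored boxes; both steps are omitted here.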
Category-Specific Object Reconstruction from a Single Image
Cited by 2 (1 self)
Object reconstruction from a single image – in the wild – is a problem where we can make progress and get meaningful results today. This is the main message of this paper, which introduces the first fully automatic pipeline having pixels as inputs and dense 3D surfaces of various rigid categories as outputs in images of realistic scenes. At the core of our approach are novel deformable 3D models that can be learned from 2D annotations available in existing object detection datasets, that can be driven by noisy automatic object segmentations, and which we complement with a bottom-up module for recovering high-frequency shape details. We perform a comprehensive quantitative analysis and ablation study of our approach using the recently introduced PASCAL 3D+ dataset and show very encouraging automatic reconstructions on PASCAL VOC.
Learning to propose objects
- CVPR
Cited by 1 (0 self)
We present an approach for highly accurate bottom-up object segmentation. Given an image, the approach rapidly generates a set of regions that delineate candidate objects in the image. The key idea is to train an ensemble of figure-ground segmentation models. The ensemble is trained jointly, enabling individual models to specialize and complement each other. We reduce ensemble training to a sequence of uncapacitated facility location problems and show that highly accurate segmentation ensembles can be trained by combinatorial optimization. The training procedure jointly optimizes the size of the ensemble, its composition, and the parameters of incorporated models, all for the same objective. The ensembles operate on elementary image features, enabling rapid image analysis. Extensive experiments demonstrate that the presented approach outperforms prior object proposal algorithms by a significant margin, while having the lowest running time. The trained ensembles generalize across datasets, indicating that the presented approach is capable of learning a generally applicable model of bottom-up segmentation.
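The uncapacitated facility location (UFL) reduction mentioned above can be illustrated with a plain greedy heuristic: facilities play the role of candidate models, clients the role of training examples, and opening a facility corresponds to adding a model to the ensemble. The function names, the cost-matrix layout, and the greedy strategy itself are illustrative assumptions; the paper uses its own combinatorial optimization, not this heuristic.

```python
def ufl_cost(open_set, open_cost, assign_cost):
    """Total cost: opening costs plus each client assigned to its
    cheapest open facility. Infinite if nothing is open."""
    if not open_set:
        return float("inf")
    n_clients = len(assign_cost[0])
    total = sum(open_cost[f] for f in open_set)
    for c in range(n_clients):
        total += min(assign_cost[f][c] for f in open_set)
    return total

def greedy_ufl(open_cost, assign_cost):
    """Greedy heuristic for uncapacitated facility location:
    repeatedly open the facility that most reduces total cost,
    stopping when no single opening helps."""
    open_set = set()
    while True:
        best, best_cost = None, ufl_cost(open_set, open_cost, assign_cost)
        for f in range(len(open_cost)):
            if f in open_set:
                continue
            c = ufl_cost(open_set | {f}, open_cost, assign_cost)
            if c < best_cost:
                best, best_cost = f, c
        if best is None:
            return sorted(open_set)
        open_set.add(best)
```

In the ensemble-training analogy, `open_cost[f]` penalizes ensemble size while `assign_cost[f][c]` measures how poorly model `f` segments example `c`, so the optimizer trades ensemble size against coverage, as the abstract describes.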
Self-Taught Object Localization with Deep Networks
Cited by 1 (0 self)
The reliance on plentiful and detailed manual annotations for training is a critical limitation of the current state of the art in object localization and detection. This paper introduces self-taught object localization, a novel approach that leverages deep convolutional networks trained for whole-image recognition to localize objects in images without additional human supervision, i.e., without using any ground-truth bounding boxes for training. The key idea is to analyze the change in the recognition scores when artificially masking out different regions of the image. The masking out of a region that contains an object typically causes a significant drop in recognition. This idea is embedded into an agglomerative clustering technique that generates self-taught localization hypotheses. For a small number of hypotheses, our object localization scheme yields a relative gain of more than 22% in both precision and recall with respect to the state of the art (BING and Selective Search) for top-1 subwindow proposal. Our experiments on a challenging dataset of 200 classes indicate that our automatically-generated annotations are accurate enough to train object detectors in a weakly-supervised fashion with recognition results remarkably close to those obtained by training on manually annotated bounding boxes.
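The mask-and-observe idea at the heart of this abstract can be sketched directly: mask each region, re-score the image, and record the score drop. This is a minimal sketch under stated assumptions: `score` is a toy stand-in for the whole-image recognition network, and the grid/stride parameters are invented; the paper feeds such drops into agglomerative clustering, which is not shown here.

```python
import numpy as np

def score(image):
    """Stand-in for a whole-image recognition network's class score.
    Here: total intensity, so masking bright regions hurts the score most."""
    return float(image.sum())

def mask_drop_map(image, mask_size=8, stride=8):
    """Score-drop heatmap: zero out each grid cell in turn and record
    how much the image-level score falls. Cells whose masking causes a
    large drop are likely to contain the recognized object."""
    base = score(image)
    h, w = image.shape
    drops = np.zeros((h // stride, w // stride))
    for i, y in enumerate(range(0, h - mask_size + 1, stride)):
        for j, x in enumerate(range(0, w - mask_size + 1, stride)):
            masked = image.copy()
            masked[y:y + mask_size, x:x + mask_size] = 0.0  # mask the cell
            drops[i, j] = base - score(masked)
    return drops
```

With a real classifier the masking value would be the dataset mean rather than zero, and overlapping masks at multiple scales would give a smoother localization map.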
Object-based RGBD Image Co-segmentation with Mutex Constraint
We present an object-based co-segmentation method that takes advantage of depth data and is able to correctly handle noisy images in which the common foreground object is missing. With RGBD images, our method utilizes the depth channel to enhance identification of similar foreground objects via a proposed RGBD co-saliency map, as well as to improve detection of object-like regions and provide depth-based local features for region comparison. To accurately deal with noisy images where the common object appears more than or less than once, we formulate co-segmentation in a fully-connected graph structure together with mutual exclusion (mutex) constraints that prevent improper solutions. Experiments show that this object-based RGBD co-segmentation with mutex constraints outperforms related techniques on an RGBD co-segmentation dataset, while effectively processing noisy images. Moreover, we show that this method also provides performance comparable to state-of-the-art RGB co-segmentation techniques on regular RGB images with depth maps estimated from them.
Salient Object Detection: A Survey
Detecting and segmenting salient objects in natural scenes, also known as salient object detection, has attracted a lot of focused research in computer vision and has resulted in many applications. However, while many such models exist, a deep understanding of achievements and issues is lacking. We aim to provide a comprehensive review of the recent progress in this field. We situate salient object detection among other closely related areas such as generic scene segmentation, object proposal generation, and saliency for fixation prediction. Covering 256 publications, we survey i) roots, key concepts, and tasks, ii) core techniques and main modeling trends, and iii) datasets and evaluation metrics in salient object detection. We also discuss open problems such as evaluation metrics and dataset bias in model performance and suggest future research directions.
Index Terms: Salient object detection, salient region detection, saliency, explicit saliency, visual attention, regions of interest, objectness, object proposal, segmentation, interestingness, importance, eye movements, scene understanding
Deep Networks for Saliency Detection via Local Estimation and Global Search
This paper presents a saliency detection algorithm by integrating both local estimation and global search. In the local estimation stage, we detect local saliency by using a deep neural network (DNN-L) which learns local patch features to determine the saliency value of each pixel. The estimated local saliency maps are further refined by exploring the high-level object concepts. In the global search stage, the local saliency map together with global contrast and geometric information are used as global features to describe a set of object candidate regions. Another deep neural network (DNN-G) is trained to predict the saliency score of each object region based on the global features. The final saliency map is generated by a weighted sum of salient object regions. Our method presents two interesting insights. First, local features learned by a supervised scheme can effectively capture local contrast, texture, and shape information for saliency detection. Second, the complex relationship between different global saliency cues can be captured by deep networks and exploited in a principled manner rather than heuristically. Quantitative and qualitative experiments on several benchmark data sets demonstrate that our algorithm performs favorably against the state-of-the-art methods.
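The final fusion step described above, a weighted sum of salient object regions, can be sketched in a few lines. This is a minimal sketch: the region masks and scores would come from the paper's candidate generator and DNN-G respectively, and `fuse_saliency` is an invented name for the combination step only.

```python
import numpy as np

def fuse_saliency(shape, regions, scores):
    """Combine region-level saliency scores into a pixel-wise map.

    shape:   (H, W) of the output map
    regions: list of boolean masks (H x W), one per candidate object region
    scores:  predicted saliency score per region (from a region scorer)
    The final map is the score-weighted sum of region masks,
    normalized to [0, 1].
    """
    sal = np.zeros(shape, dtype=float)
    for mask, s in zip(regions, scores):
        sal += s * mask.astype(float)   # each region votes with its score
    if sal.max() > 0:
        sal /= sal.max()                # normalize for display/thresholding
    return sal
```

Pixels covered by several high-scoring regions accumulate the largest values, which is what makes the weighted sum sharper than any single candidate mask.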
APT: Action localization Proposals from dense Trajectories
This paper is on action localization in video with the aid of spatio-temporal proposals. To avoid the computationally expensive segmentation step of existing proposals, we propose bypassing the segmentations completely by generating proposals directly from the dense trajectories used to represent videos during classification. Our Action localization Proposals from dense Trajectories (APT) use an efficient proposal generation algorithm to handle the high number of trajectories in a video. Our spatio-temporal proposals are faster than current methods and outperform the localization and classification accuracy of current proposals on the UCF Sports, UCF 101, and MSR-II video datasets. Corrected version: we fixed a mistake in our UCF-101 ground truth. Numbers are different; conclusions are unchanged.
Oriented Object Proposals
In this paper, we propose a new approach to generate oriented object proposals (OOPs) to reduce the detection error caused by various orientations of the object. To this end, we propose to efficiently locate object regions according to pixelwise object probability, rather than measuring the objectness from a set of sampled windows. We formulate the proposal generation problem as a generative probabilistic model such that object proposals of different shapes (i.e., sizes and orientations) can be produced by locating the local maximum likelihoods. The new approach has three main advantages. First, it helps the object detector handle objects of different orientations. Second, as the shapes of the proposals may vary to fit the objects, the resulting proposals are tighter than sampling windows with fixed sizes. Third, it avoids massive window sampling, thereby reducing the number of proposals while maintaining a high recall. Experiments on the PASCAL VOC 2007 dataset show that the proposed OOP outperforms the state-of-the-art fast methods. Further experiments show that the rotation-invariant property helps a class-specific object detector achieve better performance than the state-of-the-art proposal generation methods in both object rotation scenarios and general scenarios. Generating OOPs is very fast and takes only 0.5s per image.
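The step of "locating the local maximum likelihoods" of a pixelwise object probability map can be sketched as a simple local-maximum search. This is only an illustration of that one step: the probability map itself, the 3x3 neighborhood, and the threshold are assumptions of this sketch, not the paper's generative model, which also infers a size and orientation per maximum.

```python
import numpy as np

def local_maxima(prob, thresh=0.5):
    """Candidate proposal centers as local maxima of a pixelwise
    object probability map. A pixel is kept if it exceeds `thresh`
    and is the maximum of its 3x3 neighborhood."""
    h, w = prob.shape
    # Pad with -inf so border pixels compare only against real neighbors.
    padded = np.pad(prob, 1, mode="constant", constant_values=-np.inf)
    centers = []
    for y in range(h):
        for x in range(w):
            window = padded[y:y + 3, x:x + 3]
            if prob[y, x] >= thresh and prob[y, x] == window.max():
                centers.append((y, x))
    return centers
```

Because every pixel is examined once against a small neighborhood, this search is linear in image size, which is consistent with the abstract's emphasis on avoiding massive window sampling.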