Results 1 - 10 of 168
Hello! My name is... Buffy – Automatic naming of characters in TV video
- In BMVC, 2006
Cited by 174 (20 self)
We investigate the problem of automatically labelling appearances of characters in TV or film material. This is tremendously challenging due to the huge variation in imaged appearance of each character and the weakness and ambiguity of available annotation. However, we demonstrate that high precision can be achieved by combining multiple sources of information, both visual and textual. The principal novelties that we introduce are: (i) automatic generation of time-stamped character annotation by aligning subtitles and transcripts; (ii) strengthening the supervisory information by identifying when characters are speaking; (iii) using complementary cues of face matching and clothing matching to propose common annotations for face tracks. Results are presented on episodes of the TV series “Buffy the Vampire Slayer”.
Towards multiview object class detection
- In IEEE Conference on Computer Vision & Pattern Recognition
Cited by 127 (5 self)
Robust object detection via soft cascade
- In Proc. Intl. Conf. on Computer Vision and Pattern Recognition, 2005
Cited by 99 (1 self)
We describe a method for training object detectors using a generalization of the cascade architecture, which results in a detection rate and speed comparable to that of the best published detectors while allowing for easier training and a detector with fewer features. In addition, the method allows for quickly calibrating the detector for a target detection rate, false positive rate or speed. One important advantage of our method is that it enables systematic exploration of the ROC surface, which characterizes the trade-off between accuracy and speed for a given classifier.
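The abstract does not spell out the mechanism, so the following is only a minimal sketch of one common reading of a soft cascade: weak classifier outputs are accumulated into a running score, and a sample is rejected as soon as that score drops below a per-stage rejection threshold. The stump classifiers, thresholds, and sample values are hypothetical and chosen purely for illustration.

```python
def soft_cascade(sample, stages):
    """Evaluate a sample against a soft cascade.

    `stages` is a list of (weak_classifier, rejection_threshold) pairs.
    Returns (accepted, running_score, stages_evaluated).
    """
    score = 0.0
    for evaluated, (weak, threshold) in enumerate(stages, start=1):
        score += weak(sample)
        # Early exit: the accumulated evidence is already too weak.
        if score < threshold:
            return False, score, evaluated
    return True, score, len(stages)


def stump(index, cut, vote):
    """Hypothetical decision stump voting +vote or -vote on one feature."""
    return lambda x: vote if x[index] > cut else -vote


# Illustrative stages: (weak classifier, rejection threshold).
stages = [
    (stump(0, 0.5, 1.0), -1.5),
    (stump(1, 0.2, 0.5), -0.6),
    (stump(2, 0.1, 0.8), 0.0),
]

face_like = (0.9, 0.3, 0.4)
background = (0.1, 0.0, 0.4)

result_face = soft_cascade(face_like, stages)
result_background = soft_cascade(background, stages)
```

Under this reading, loosening or tightening the rejection thresholds trades detection rate against the average number of stages evaluated, which is one way to picture the accuracy and speed trade-off the abstract mentions.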
Illumination Invariant Face Recognition Using Near-Infrared Images
Cited by 81 (12 self)
Most current face recognition systems are designed for indoor, cooperative-user applications. However, even in thus-constrained applications, most existing systems, academic and commercial, are compromised in accuracy by changes in environmental illumination. In this paper, we present a novel solution for illumination invariant face recognition for indoor, cooperative-user applications. First, we present an active near infrared (NIR) imaging system that is able to produce face images of good condition regardless of visible lights in the environment. Second, we show that the resulting face images encode intrinsic information of the face, subject only to a monotonic transform in the gray tone; based on this, we use local binary pattern (LBP) features to compensate for the monotonic transform, thus deriving an illumination invariant face representation. Then, we present methods for face recognition using NIR images; statistical learning algorithms are used to extract the most discriminative features from a large pool of invariant LBP features and construct a highly accurate face matching engine. Finally, we present a system that is able to achieve accurate and fast face recognition in practice, in which a method is provided to deal with specular reflections of active NIR lights on eyeglasses, a critical issue in active NIR image-based face recognition. Extensive, comparative results are provided to evaluate the imaging hardware, the face and eye detection algorithms, and the face recognition algorithms and systems, with respect to various factors, including illumination, eyeglasses, time lapse, and ethnic groups.
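The claim that LBP features compensate for a monotonic gray-tone transform can be checked on a toy example. The sketch below uses the basic 8-neighbour LBP operator, which may differ from the variant used in the paper; the 3x3 patch and the gain and offset of the transform are made-up values.

```python
def lbp_code(img, r, c):
    """Basic 8-neighbour local binary pattern code of pixel (r, c)."""
    center = img[r][c]
    # Neighbours visited clockwise, starting at the top-left corner.
    offsets = [(-1, -1), (-1, 0), (-1, 1), (0, 1),
               (1, 1), (1, 0), (1, -1), (0, -1)]
    code = 0
    for bit, (dr, dc) in enumerate(offsets):
        # Each neighbour contributes one bit: is it at least as bright?
        if img[r + dr][c + dc] >= center:
            code |= 1 << bit
    return code


patch = [[52, 60, 61],
         [49, 55, 70],
         [30, 55, 90]]

# A strictly increasing gray-level transform (hypothetical gain and offset).
brighter = [[2 * v + 10 for v in row] for row in patch]

code_original = lbp_code(patch, 1, 1)
code_transformed = lbp_code(brighter, 1, 1)
```

Because the code depends only on the ordering of each neighbour relative to the centre pixel, any strictly increasing transform leaves it unchanged, so `code_original` and `code_transformed` are equal.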
Real-time compressive tracking
- In ECCV
Cited by 75 (8 self)
It is a challenging task to develop effective and efficient appearance models for robust object tracking due to factors such as pose variation, illumination change, occlusion, and motion blur. Existing online tracking algorithms often update models with samples from observations in recent frames. While much success has been demonstrated, numerous issues remain to be addressed. First, while these adaptive appearance models are data-dependent, there is not a sufficient amount of data for online algorithms to learn from at the outset. Second, online tracking algorithms often encounter drift problems. As a result of self-taught learning, misaligned samples are likely to be added and degrade the appearance models. In this paper, we propose a simple yet effective and efficient tracking algorithm with an appearance model based on features extracted from the multi-scale image feature space with a data-independent basis. Our appearance model employs non-adaptive random projections that preserve the structure of the image feature space of objects. A very sparse measurement matrix is adopted to efficiently extract the features for the appearance model. We compress samples of foreground targets and the background using the same sparse measurement matrix. The tracking task is formulated as a binary classification via a naive Bayes classifier with online update in the compressed domain. The proposed compressive tracking algorithm runs in real-time and performs favorably against state-of-the-art algorithms on challenging sequences in terms of efficiency, accuracy and robustness.
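As a rough illustration of a very sparse, data-independent measurement matrix, the sketch below uses the standard sparse random projection construction, with entries of +sqrt(s), 0 and -sqrt(s) drawn with probabilities 1/(2s), 1 - 1/s and 1/(2s). Whether the paper uses exactly this construction, and the sizes and sparsity parameter chosen here, are assumptions.

```python
import math
import random


def sparse_measurement_matrix(n_rows, n_cols, s, rng):
    """Build a sparse random matrix, stored row-wise as {column: weight}."""
    scale = math.sqrt(s)
    matrix = []
    for _ in range(n_rows):
        row = {}
        for j in range(n_cols):
            u = rng.random()
            if u < 1.0 / (2 * s):
                row[j] = scale       # probability 1/(2s)
            elif u < 1.0 / s:
                row[j] = -scale      # probability 1/(2s)
            # Otherwise the entry is zero and is simply not stored.
        matrix.append(row)
    return matrix


def compress(matrix, features):
    """Project a high-dimensional feature vector to len(matrix) values."""
    return [sum(w * features[j] for j, w in row.items()) for row in matrix]


rng = random.Random(0)               # fixed seed: the matrix is data-independent
matrix = sparse_measurement_matrix(n_rows=5, n_cols=50, s=4, rng=rng)
features = [float(j) for j in range(50)]   # stand-in for image features
compressed = compress(matrix, features)
```

The matrix is generated once and reused for every foreground and background sample, so only the few non-zero entries per row need to be evaluated at tracking time.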
Joint haar-like features for face detection
- In ICCV
Cited by 45 (0 self)
In this paper, we propose a new distinctive feature, called joint Haar-like feature, for detecting faces in images. This is based on co-occurrence of multiple Haar-like features. Feature co-occurrence, which captures the structural similarities within the face class, makes it possible to construct an effective classifier. The joint Haar-like feature can be calculated very fast and has robustness against addition of noise and change in illumination. A face detector is learned by stagewise selection of the joint Haar-like features using AdaBoost. A small number of distinctive features achieve both computational efficiency and accuracy. Experimental results with 5,676 face images and 30,000 nonface images show that our detector yields higher classification performance than Viola and Jones' detector, which uses a single feature for each weak classifier. Given the same number of features, our method reduces the error by 37%. Our detector is 2.6 times as fast as Viola and Jones' detector to achieve the same performance.
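One way to picture feature co-occurrence is to binarize several Haar-like responses and pack the bits into a single index. The sketch below computes two-rectangle responses from an integral image and combines them in that way; the 4x4 image, rectangle layouts, and thresholds are invented for illustration and are not taken from the paper.

```python
def integral_image(img):
    """Summed-area table with one extra row and column of zeros."""
    h, w = len(img), len(img[0])
    ii = [[0] * (w + 1) for _ in range(h + 1)]
    for r in range(h):
        for c in range(w):
            ii[r + 1][c + 1] = img[r][c] + ii[r][c + 1] + ii[r + 1][c] - ii[r][c]
    return ii


def rect_sum(ii, top, left, height, width):
    """Sum of pixels in a rectangle using four table lookups."""
    bottom, right = top + height, left + width
    return ii[bottom][right] - ii[top][right] - ii[bottom][left] + ii[top][left]


def haar_horizontal(ii, top, left, height, width):
    """Two-rectangle response: left half minus right half."""
    half = width // 2
    return (rect_sum(ii, top, left, height, half)
            - rect_sum(ii, top, left + half, height, half))


def haar_vertical(ii, top, left, height, width):
    """Two-rectangle response: top half minus bottom half."""
    half = height // 2
    return (rect_sum(ii, top, left, half, width)
            - rect_sum(ii, top + half, left, half, width))


def joint_index(responses, thresholds):
    """Binarize each response and pack the bits (first feature = top bit)."""
    index = 0
    for response, threshold in zip(responses, thresholds):
        index = (index << 1) | (1 if response > threshold else 0)
    return index


img = [[9, 9, 1, 1],
       [9, 9, 1, 1],
       [1, 1, 1, 1],
       [1, 1, 1, 1]]
ii = integral_image(img)

responses = [
    haar_horizontal(ii, 0, 0, 4, 4),   # whole image, left vs right
    haar_vertical(ii, 0, 0, 4, 4),     # whole image, top vs bottom
    haar_horizontal(ii, 2, 0, 2, 4),   # bottom half, left vs right
]
joint = joint_index(responses, thresholds=[10, 10, 10])
```

With three binarized features the joint index takes one of eight values, so a weak classifier built on it can respond to a specific combination of feature outcomes rather than to each feature separately.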
Learning discriminant features for multi-view face and eye detection
- In Proc. CVPR, 2005
Cited by 33 (7 self)
In current face detection, the most commonly used features are selected from a large set (e.g. Haar wavelets). Generally, Haar wavelets only represent local geometric features. When those features are applied to profile faces and eyes with irregular geometric patterns, the classifier accuracy is low in the later training stages, only near 50%. In this paper, instead of brute-force searching the large feature set, we propose to statistically learn the discriminant features for object detection. Besides applying Fisher discriminant analysis (FDA) in AdaBoost, we further propose recursive nonparametric discriminant analysis (RNDA) to handle more general cases. These discriminant analysis features are not constrained by geometric shape and can provide better accuracy. The compact size of the feature set makes it possible to select a near-optimal subset of features and construct the probabilistic classifiers by greedy search. The proposed methods are applied to multi-view face and eye detection and achieve good accuracy.
Fast pedestrian detection using a cascade of boosted covariance features
- IEEE Transactions on Circuits and Systems for Video Technology, 2008
Cited by 30 (12 self)
Efficiently and accurately detecting pedestrians plays a very important role in many computer vision applications such as video surveillance and smart cars. In order to find the right feature for this task, we first present a comprehensive experimental study on pedestrian detection using state-of-the-art locally extracted features (e.g., local receptive fields, histogram of oriented gradients, and region covariance). Building upon the findings of our experiments, we propose a new, simpler pedestrian detector using the covariance features. Unlike the work in [1], where the feature selection and weak classifier training are performed on the Riemannian manifold, we select features and train weak classifiers in the Euclidean space for faster computation. To this end, AdaBoost with weak classifiers based on weighted Fisher linear discriminant analysis is designed. A cascaded classifier structure is constructed for efficiency in the detection phase. Experiments on different datasets show that the new pedestrian detector is not only comparable to the state-of-the-art pedestrian detectors but also runs faster. To further accelerate the detection, we adopt a faster strategy, multiple-layer boosting with heterogeneous features, to exploit the efficiency of the Haar feature and the discriminative power of the covariance feature. Experiments show that, by combining the Haar and covariance features, we speed up the original covariance feature detector [1] by up to an order of magnitude in detection time with a slight drop in detection performance. Index Terms: AdaBoost, boosting with heterogeneous features, local features, pedestrian detection/classification, support vector machine.
Probabilistic 3D polyp detection in CT images: The role of sample alignment
- In Proc. Conf. Computer Vision and Pattern Recognition, volume II, 2006
Cited by 27 (10 self)
Automatic polyp detection is an increasingly important task in medical imaging with virtual colonoscopy [15] being widely used. In this paper, we present a 3D object detection algorithm and show its application on polyp detection from CT images. We make the following contributions: (1) The system adopts Probabilistic Boosting Tree (PBT) to probabilistically detect polyps. Integral volume and 3D Haar filters are introduced to achieve fast feature computation. (2) We give an explicit convergence rate analysis for the AdaBoost algorithm [2] and prove that the error at each step $\epsilon_{t+1}$ is tightly bounded by the previous error $\epsilon_t$. (3) For a 3D polyp template, a generative model is defined. Given the bound and convergence analysis, we analyze the role of “sample alignment” in the template design and devise a robust and efficient algorithm for polyp detection. The overall system has been tested on 150 volumes and the results obtained are very encouraging.
BoostMap: An Embedding Method for Efficient Nearest Neighbor Retrieval
- 2008
Cited by 23 (5 self)
This paper describes BoostMap, a method for efficient nearest neighbor retrieval under computationally expensive distance measures. Database and query objects are embedded into a vector space in which distances can be measured efficiently. Each embedding is treated as a classifier that predicts for any three objects X, A, B whether X is closer to A or to B. It is shown that a linear combination of such embedding-based classifiers naturally corresponds to an embedding and a distance measure. Based on this property, the BoostMap method reduces the problem of embedding construction to the classical boosting problem of combining many weak classifiers into an optimized strong classifier. The classification accuracy of the resulting strong classifier is a direct measure of the amount of nearest neighbor structure preserved by the embedding. An important property of BoostMap is that the embedding optimization criterion is equally valid in both metric and nonmetric spaces. Performance is evaluated in databases of hand images, handwritten digits, and time series. In all cases, BoostMap significantly improves retrieval efficiency with small losses in accuracy compared to brute-force search. Moreover, BoostMap significantly outperforms existing nearest neighbor retrieval methods such as Lipschitz embeddings, FastMap, and VP-trees.
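The link between triple classifiers and an embedding can be sketched as follows, assuming one-dimensional embeddings defined by the distance to a reference object and a weighted L1 distance in the embedded space. The Euclidean stand-in for the expensive distance, the reference points, and the weights are all hypothetical; in BoostMap the embeddings and weights would be chosen by boosting.

```python
import math


def euclidean(p, q):
    """Stand-in for a computationally expensive distance measure."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(p, q)))


def reference_embedding(distance, reference):
    """1D embedding: map an object to its distance from a reference object."""
    return lambda x: distance(x, reference)


def triple_score(embed, x, a, b):
    """Positive if the embedding places x closer to a than to b."""
    return abs(embed(x) - embed(b)) - abs(embed(x) - embed(a))


def weighted_l1(embeds, weights, x, y):
    """Distance between x and y in the combined embedded space."""
    return sum(w * abs(f(x) - f(y)) for f, w in zip(embeds, weights))


# Hypothetical reference objects and weights.
embeds = [reference_embedding(euclidean, (0.0, 0.0)),
          reference_embedding(euclidean, (10.0, 0.0))]
weights = [0.6, 0.4]

x, a, b = (1.0, 1.0), (2.0, 1.0), (8.0, 3.0)

# Linear combination of the per-embedding triple classifiers.
combined = sum(w * triple_score(f, x, a, b) for f, w in zip(embeds, weights))

# The same quantity expressed through the weighted L1 distance.
via_distance = (weighted_l1(embeds, weights, x, b)
                - weighted_l1(embeds, weights, x, a))
```

Here `combined` and `via_distance` agree up to rounding, which illustrates why a weighted sum of embedding-based classifiers can be read as a multi-dimensional embedding with a weighted L1 distance. Its sign also matches the true ordering for this triple, since `x` is closer to `a` than to `b` under the original distance.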