Results 1 - 10
of
41
Dictionary-based Face Recognition from Video
"... Abstract. The main challenge in recognizing faces in video is effectively exploiting the multiple frames of a face and the accompanying dynamic signature. One prominent method is based on extracting joint appearance and behavioral features. A second method models a person by temporal correlations of ..."
Abstract
-
Cited by 20 (7 self)
- Add to MetaCart
(Show Context)
Abstract. The main challenge in recognizing faces in video is effectively exploiting the multiple frames of a face and the accompanying dynamic signature. One prominent method is based on extracting joint appearance and behavioral features. A second method models a person by temporal correlations of features in a video. Our approach introduces the concept of video-dictionaries for face recognition, which generalizes the work in sparse representation and dictionaries for faces in still images. Video-dictionaries are designed to implicitly encode temporal, pose, and illumination information. We demonstrate our method on the Face and Ocular Challenge Series (FOCS) Video Challenge, which consists of unconstrained video sequences. We show that our method is efficient and performs significantly better than many competitive video-based face recognition algorithms. 1
On computing the Riemannian 1-Center
- Computational Geometry: Theory and Applications
"... Abstract We generalize the Euclidean 1-center approximation algorithm of ..."
Abstract
-
Cited by 9 (2 self)
- Add to MetaCart
Abstract We generalize the Euclidean 1-center approximation algorithm of
Video-based Face Recognition via Joint Sparse Representation
"... Abstract — In video-based face recognition, a key challenge is in exploiting the extra information available in a video; e.g., face, body, and motion identity cues. In addition, different video sequences of the same subject may contain variations in resolution, illumination, pose, and facial express ..."
Abstract
-
Cited by 7 (0 self)
- Add to MetaCart
(Show Context)
Abstract — In video-based face recognition, a key challenge is in exploiting the extra information available in a video; e.g., face, body, and motion identity cues. In addition, different video sequences of the same subject may contain variations in resolution, illumination, pose, and facial expressions. These variations contribute to the challenges in designing an effective video-based face-recognition algorithm. We propose a novel multivariate sparse representation method for video-to-video face recognition. Our method simultaneously takes into account correlations as well as coupling information among the video frames. Our method jointly represents all the video data by a sparse linear combination of training data. In addition, we modify our model so that it is robust in the presence of noise and occlusion. Furthermore, we kernelize the algorithm to handle the non-linearities present in video data. Numerous experiments using unconstrained video sequences show that our method is effective and performs significantly better than many state-ofthe-art video-based face recognition algorithms in the literature. I.
Semi-intrinsic mean shift on riemannian manifolds
- Proc. European Conference on Computer Vision (ECCV), Lecture Notes in Computer Science
, 2012
"... Abstract. The original mean shift algorithm [1] on Euclidean spaces (MS) was extended in [2] to operate on general Riemannian manifolds. This extension is extrinsic (Ext-MS) since the mode seeking is performed on the tangent spaces [3], where the underlying curvature is not fully con-sidered (tangen ..."
Abstract
-
Cited by 7 (2 self)
- Add to MetaCart
(Show Context)
Abstract. The original mean shift algorithm [1] on Euclidean spaces (MS) was extended in [2] to operate on general Riemannian manifolds. This extension is extrinsic (Ext-MS) since the mode seeking is performed on the tangent spaces [3], where the underlying curvature is not fully con-sidered (tangent spaces are only valid in a small neighborhood). In [3] was proposed an intrinsic mean shift designed to operate on two par-ticular Riemannian manifolds (IntGS-MS), i.e. Grassmann and Stiefel manifolds (using manifold-dedicated density kernels). It is then natural to ask whether mean shift could be intrinsically extended to work on a large class of manifolds. We propose a novel paradigm to intrinsically reformulate the mean shift on general Riemannian manifolds. This is ac-complished by embedding the Riemannian manifold into a Reproducing Kernel Hilbert Space (RKHS) by using a general and mathematically well-founded Riemannian kernel function, i.e. heat kernel [4]. The key issue is that when the data is implicitly mapped to the Hilbert space, the curvature of the manifold is taken into account (i.e. exploits the underlying information of the data). The inherent optimization is then performed on the embedded space. Theoretic analysis and experimental results demonstrate the promise and effectiveness of this novel paradigm. 1
A blur-robust descriptor with applications to face recognition
- IEEE Trans Pattern Anal. Mach. Intell
"... Abstract—Understanding the effect of blur is an important problem in unconstrained visual analysis. We address this problem in the context of image-based recognition, by a fusion of image-formation models, and differential geometric tools. First, we discuss the space spanned by blurred versions of a ..."
Abstract
-
Cited by 6 (1 self)
- Add to MetaCart
(Show Context)
Abstract—Understanding the effect of blur is an important problem in unconstrained visual analysis. We address this problem in the context of image-based recognition, by a fusion of image-formation models, and differential geometric tools. First, we discuss the space spanned by blurred versions of an image and then under certain assumptions, provide a differential geometric analysis of that space. More specifically, we create a subspace resulting from convolution of an image with a complete set of orthonormal basis functions of a pre-specified maximum size (that can represent an arbitrary blur kernel within that size), and show that the corresponding subspaces created from a clean image and its blurred versions are equal under the ideal case of zero noise, and some assumptions on the properties of blur kernels. We then study the practical utility of this subspace representation for the problem of direct recognition of blurred faces, by viewing the subspaces as points on the Grassmann manifold and present methods to perform recognition for cases where the blur is both homogenous and spatially varying. We empirically analyze the effect of noise, as well as the presence of other facial variations between the gallery and probe images, and provide comparisons with existing approaches on standard datasets.
CLUSTERING ON GRASSMANN MANIFOLDS VIA KERNEL EMBEDDING WITH APPLICATION TO ACTION ANALYSIS
"... With the aim of improving the clustering of data (such as image sequences) lying on Grassmann manifolds, we propose to embed the manifolds into Reproducing Kernel Hilbert Spaces. To this end, we define a measure of cluster distortion and embed the manifolds such that the distortion is minimised. We ..."
Abstract
-
Cited by 5 (2 self)
- Add to MetaCart
(Show Context)
With the aim of improving the clustering of data (such as image sequences) lying on Grassmann manifolds, we propose to embed the manifolds into Reproducing Kernel Hilbert Spaces. To this end, we define a measure of cluster distortion and embed the manifolds such that the distortion is minimised. We show that the optimal solution is a generalised eigenvalue problem that can be solved very efficiently. Experiments on several clustering tasks (including human action clustering) show that in comparison to the recent intrinsic Grassmann k-means algorithm, the proposed approach obtains notable improvements in clustering accuracy, while also being several orders of magnitude faster.
Efficient higher-order clustering on the grassmann manifold
- In ICCV
, 2013
"... The higher-order clustering problem arises when data is drawn from multiple subspaces or when observations fit a higher-order parametric model. Most solutions to this problem either decompose higher-order similarity measures for use in spectral clustering or explicitly use low-rank ma-trix represent ..."
Abstract
-
Cited by 5 (0 self)
- Add to MetaCart
(Show Context)
The higher-order clustering problem arises when data is drawn from multiple subspaces or when observations fit a higher-order parametric model. Most solutions to this problem either decompose higher-order similarity measures for use in spectral clustering or explicitly use low-rank ma-trix representations. In this paper we present our approach of Sparse Grassmann Clustering (SGC) that combines at-tributes of both categories. While we decompose the higher-order similarity tensor, we cluster data by directly finding a low dimensional representation without explicitly build-ing a similarity matrix. By exploiting recent advances in online estimation on the Grassmann manifold (GROUSE) we develop an efficient and accurate algorithm that works with individual columns of similarities or partial observa-tions thereof. Since it avoids the storage and decomposition of large similarity matrices, our method is efficient, scal-able and has low memory requirements even for large-scale data. We demonstrate the performance of our SGC method on a variety of segmentation problems including planar seg-mentation of Kinect depth maps and motion segmentation of the Hopkins 155 dataset for which we achieve performance comparable to the state-of-the-art. 1.
Learning non-linear reconstruction models for image set classification
- In IEEE Computer Society Conference on Computer Vision and Pattern Recognition (CVPR
, 2014
"... We propose a deep learning framework for image set classification with application to face recognition. An Adaptive Deep Network Template (ADNT) is defined whose parameters are initialized by performing unsupervised pre-training in a layer-wise fashion using Gaussian Restricted Boltzmann Machines (G ..."
Abstract
-
Cited by 5 (2 self)
- Add to MetaCart
(Show Context)
We propose a deep learning framework for image set classification with application to face recognition. An Adaptive Deep Network Template (ADNT) is defined whose parameters are initialized by performing unsupervised pre-training in a layer-wise fashion using Gaussian Restricted Boltzmann Machines (GRBMs). The pre-initialized ADNT is then separately trained for images of each class and class-specific models are learnt. Based on the minimum reconstruction error from the learnt class-specific mod-els, a majority voting strategy is used for classification. The proposed framework is extensively evaluated for the task of image set classification based face recognition on Honda/UCSD, CMU Mobo, YouTube Celebrities and a Kinect dataset. Our experimental results and comparisons with existing state-of-the-art methods show that the pro-posed method consistently achieves the best performance on all these datasets. 1.
Rolling riemannian manifolds to solve the multiclass classification problem
- In CVPR
"... Abstract In the past few years there has been a growing interest on geometric frameworks to learn supervised classification models on Riemannian manifolds ..."
Abstract
-
Cited by 4 (1 self)
- Add to MetaCart
(Show Context)
Abstract In the past few years there has been a growing interest on geometric frameworks to learn supervised classification models on Riemannian manifolds
Elastic functional coding of human actions: From vector-fields to latent variables
- In CVPR
, 2015
"... Human activities observed from visual sensors often give rise to a sequence of smoothly varying features. In many cases, the space of features can be formally defined as a manifold, where the action becomes a trajectory on the manifold. Such trajectories are high dimensional in addi-tion to being no ..."
Abstract
-
Cited by 3 (2 self)
- Add to MetaCart
(Show Context)
Human activities observed from visual sensors often give rise to a sequence of smoothly varying features. In many cases, the space of features can be formally defined as a manifold, where the action becomes a trajectory on the manifold. Such trajectories are high dimensional in addi-tion to being non-linear, which can severely limit computa-tions on them. We also argue that by their nature, human ac-tions themselves lie on a much lower dimensional manifold compared to the high dimensional feature space. Learning an accurate low dimensional embedding for actions could have a huge impact in the areas of efficient search and retrieval, visualization, learning, and recognition. Tradi-tional manifold learning addresses this problem for static points in Rn, but its extension to trajectories on Rieman-