Blessing of Dimensionality: High-dimensional Feature and Its Efficient Compression for Face Verification
Cited by 48 (2 self)
Making a high-dimensional (e.g., 100K-dim) feature for face recognition does not seem like a good idea because it brings difficulties for subsequent training, computation, and storage. This has prevented further exploration of high-dimensional features. In this paper, we study the performance of a high-dimensional feature. We first empirically show that high dimensionality is critical to high performance. A 100K-dim feature, based on a single-type Local Binary Pattern (LBP) descriptor, can achieve significant improvements over both its low-dimensional version and the state-of-the-art. We also make the high-dimensional feature practical. With our proposed sparse projection method, named rotated sparse regression, both computation and model storage can be reduced by over 100 times without sacrificing accuracy.
H.: Annotated facial landmarks in the wild: A large-scale, real-world database for facial landmark localization
In: Benchmarking Facial Image Analysis Technologies (ICCV Workshop), 2011
Cited by 42 (2 self)
Face alignment is a crucial step in face recognition tasks. In particular, using landmark localization for geometric face normalization has been shown to be very effective, clearly improving the recognition results. However, no adequate databases exist that provide a sufficient number of annotated facial landmarks. The databases are either limited to frontal views, provide only a small number of annotated images, or have been acquired under controlled conditions. Hence, we introduce a novel database overcoming these limitations: Annotated Facial Landmarks in the Wild (AFLW). AFLW provides a large-scale collection of images gathered from Flickr, exhibiting a large variety in face appearance (e.g., pose, expression, ethnicity, age, gender) as well as general imaging and environmental conditions. In total, 25,993 faces in 21,997 real-world images are annotated with up to 21 landmarks per image. Due to the comprehensive set of annotations, AFLW is well suited to train and test algorithms for multi-view face detection, facial landmark localization and face pose estimation. Further, we offer a rich set of tools that ease the integration of other face databases and associated annotations into our joint framework.
Bayesian Face Revisited: A Joint Formulation
Cited by 38 (2 self)
In this paper, we revisit the classical Bayesian face recognition method by Baback Moghaddam et al. and propose a new joint formulation. The classical Bayesian method models the appearance difference between two faces. We observe that this “difference” formulation may reduce the separability between classes. Instead, we model two faces jointly with an appropriate prior on the face representation. Our joint formulation leads to an EM-like model learning at training time and an efficient, closed-form computation at test time. In extensive experimental evaluations, our method is superior to the classical Bayesian face method and many other supervised approaches. Our method achieved 92.4% test accuracy on the challenging Labeled Faces in the Wild (LFW) dataset. Compared with the current best commercial system, we reduced the error rate by 10%.
G.: A practical transfer learning algorithm for face verification
In: ICCV (2013)
Cited by 17 (0 self)
Face verification involves determining whether a pair of facial images belongs to the same or different subjects. This problem can prove to be quite challenging in many important applications where labeled training data is scarce, e.g., family album photo organization software. Herein we propose a principled transfer learning approach for merging plentiful source-domain data with limited samples from some target domain of interest to create a classifier that ideally performs nearly as well as if rich target-domain data were present. Based upon a surprisingly simple generative Bayesian model, our approach combines a KL-divergence-based regularizer/prior with a robust likelihood function leading to a scalable implementation via the EM algorithm. As justification for our design choices, we later use principles from convex analysis to recast our algorithm as an equivalent structured rank minimization problem leading to a number of interesting insights related to solution structure and feature-transform invariance. These insights help to both explain the effectiveness of our algorithm as well as elucidate a wide variety of related Bayesian approaches. Experimental testing with challenging datasets validates the utility of the proposed algorithm.
Leveraging billions of faces to overcome performance barriers in unconstrained face recognition. ArXiv e-prints, 2011
"... face.com ..."
Large scale strongly supervised ensemble metric learning, with applications to face verification and retrieval, 2011
Cited by 12 (0 self)
Learning Mahalanobis distance metrics in a high-dimensional feature space is very difficult, especially when structural sparsity and low rank are enforced to improve computational efficiency in the testing phase. This paper addresses both aspects by an ensemble metric learning approach that consists of sparse block diagonal metric ensembling and joint metric learning as two consecutive steps. The former step pursues a highly sparse block diagonal metric by selecting effective feature groups, while the latter further exploits correlations between selected feature groups to obtain an accurate and low rank metric. Our algorithm considers all pairwise or triplet constraints generated from training samples with explicit class labels, and possesses good scalability with respect to increasing feature dimensionality and growing data volumes. Its applications to face verification and retrieval outperform existing state-of-the-art methods in accuracy while retaining high efficiency.
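For orientation only, and not taken from the paper: a minimal sketch of what a low rank Mahalanobis metric means in practice. It assumes the metric is factored as M = L^T L, so the distance under M equals a squared Euclidean distance after projecting with L. The names `mahalanobis_sq`, `low_rank_metric`, and the toy dimensions are illustrative assumptions; the paper's ensemble learning procedure is not reproduced here.

```python
import numpy as np

def mahalanobis_sq(x, y, M):
    """Squared Mahalanobis distance (x - y)^T M (x - y) for a PSD matrix M."""
    diff = x - y
    return float(diff @ M @ diff)

def low_rank_metric(L):
    """Build a PSD metric M = L^T L whose rank is at most the number of rows of L."""
    return L.T @ L

# Toy data (hypothetical sizes): 6-dim features, rank-2 metric.
rng = np.random.default_rng(0)
L = rng.normal(size=(2, 6))
M = low_rank_metric(L)
x, y = rng.normal(size=(2, 6))

# Distance under M, and the same quantity computed in the projected 2-dim space.
d_full = mahalanobis_sq(x, y, M)
d_proj = float(np.sum((L @ x - L @ y) ** 2))
```

The two quantities agree up to rounding, which is why a low rank metric can make test-time comparison cheaper: features can be projected once with L and compared in the smaller space.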
Spoofing and countermeasures for automatic speaker verification
Cited by 8 (4 self)
It is widely acknowledged that most biometric systems are vulnerable to spoofing, also known as imposture. While vulnerabilities and countermeasures for other biometric modalities, e.g. face verification, have been widely studied, speaker verification systems remain vulnerable. This paper describes some specific vulnerabilities studied in the literature and presents a brief survey of recent work to develop spoofing countermeasures. The paper concludes with a discussion of the need for standard datasets, metrics and formal evaluations to assess vulnerabilities to spoofing in realistic scenarios without prior knowledge. Index Terms: spoofing, imposture, automatic speaker verification
A PRACTICAL, SELF-ADAPTIVE VOICE ACTIVITY DETECTOR FOR SPEAKER VERIFICATION WITH NOISY TELEPHONE AND MICROPHONE DATA
Cited by 8 (5 self)
A voice activity detector (VAD) plays a vital role in robust speaker verification, where energy VAD is most commonly used. Energy VAD works well in noise-free conditions but deteriorates in noisy conditions. One way to tackle this is to introduce speech enhancement preprocessing. We study an alternative, likelihood ratio based VAD that trains speech and nonspeech models on an utterance-by-utterance basis from mel-frequency cepstral coefficients (MFCCs). The training labels are obtained from enhanced energy VAD. As the speech and nonspeech models are re-trained for each utterance, minimal assumptions about the background noise are made. According to both VAD error analysis and speaker verification results using a state-of-the-art i-vector system, the proposed method outperforms energy VAD variants by a wide margin. We provide an open-source implementation of the method. Index Terms: voice activity detection, speaker verification
Pairwise Support Vector Machines and their Application to Large Scale Problems, 2012
Cited by 8 (0 self)
Pairwise classification is the task of predicting whether the examples a, b of a pair (a,b) belong to the same class or to different classes. In particular, interclass generalization problems can be treated in this way. In pairwise classification, the order of the two input examples should not affect the classification result. To achieve this, particular kernels as well as the use of symmetric training sets in the framework of support vector machines were suggested. The paper discusses both approaches in a general way and establishes a strong connection between them. In addition, an efficient implementation is discussed which allows training on several million pairs. The value of these contributions is confirmed by excellent results on the Labeled Faces in the Wild benchmark.
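As an illustration of the order-invariance requirement described above, here is a small sketch of one commonly used symmetrized pairwise kernel, K((a,b),(c,d)) = k(a,c)k(b,d) + k(a,d)k(b,c). This particular form, the linear base kernel, and the names `base_kernel` and `pairwise_kernel` are assumptions chosen for the example; the abstract does not state which kernels the paper analyzes.

```python
import numpy as np

def base_kernel(x, y):
    """Linear kernel on single examples (an illustrative choice)."""
    return float(np.dot(x, y))

def pairwise_kernel(pair1, pair2, k=base_kernel):
    """Symmetrized kernel on pairs: unchanged when either pair's members are swapped."""
    a, b = pair1
    c, d = pair2
    return k(a, c) * k(b, d) + k(a, d) * k(b, c)

# Four random 5-dim examples forming two pairs (toy data).
rng = np.random.default_rng(0)
a, b, c, d = rng.normal(size=(4, 5))

k_ab_cd = pairwise_kernel((a, b), (c, d))
k_ba_cd = pairwise_kernel((b, a), (c, d))  # first pair swapped
k_ab_dc = pairwise_kernel((a, b), (d, c))  # second pair swapped
```

Because swapping the members of a pair only exchanges the two summands, all three values coincide, so a classifier built on this kernel cannot depend on the order of the inputs.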
Probabilistic Linear Discriminant Analysis with Bottleneck Features for Speech Recognition
Cited by 6 (3 self)
We have recently proposed a new acoustic model based on probabilistic linear discriminant analysis (PLDA) which enjoys the flexibility of using higher dimensional acoustic features, and is better able to capture the intra-frame feature correlations. In this paper, we investigate the use of bottleneck features obtained from a deep neural network (DNN) for the PLDA-based acoustic model. Experiments were performed on the Switchboard dataset, a large vocabulary conversational telephone speech corpus. We observe significant word error reduction by using the bottleneck features. In addition, we have also compared the PLDA-based acoustic model to three others using Gaussian mixture models (GMMs), subspace GMMs and hybrid deep neural networks (DNNs), and PLDA can achieve comparable or slightly higher recognition accuracy in our experiments. Index Terms: speech recognition, bottleneck features, probabilistic linear discriminant analysis