## Learning a similarity metric discriminatively, with application to face verification (2005)


Venue: | In Proc. of Computer Vision and Pattern Recognition Conference |

Citations: | 95 - 11 self |

### BibTeX

@INPROCEEDINGS{Chopra05learninga,

author = {Sumit Chopra and Raia Hadsell and Yann LeCun},

title = {Learning a similarity metric discriminatively, with application to face verification},

booktitle = {In Proc. of Computer Vision and Pattern Recognition Conference},

year = {2005},

pages = {539--546},

publisher = {IEEE Press}

}

### Citations

2777 | Eigenfaces for Recognition
- Turk, Pentland
- 1991
Citation Context ... a very small number of samples. 1.1. Previous Work The idea of mapping face images to low dimensional target spaces before comparison has a long history, starting with the PCA-based Eigenface method [16] in which $G_W(X)$ is a linear projection trained non-discriminatively to maximize the variance. The LDA-based Fisherface method [3] is also linear, but trained discriminatively so as to maximize the r... |

1608 | Nonlinear dimensionality reduction by locally linear embedding
- Roweis, Saul
- 2000
Citation Context ...ning framework for energy-based models (EBM). Our method is very different from other dimensionality reduction techniques such as Multi-Dimensional Scaling (MDS) [13] and Local Linear Embedding (LLE) [15]. MDS computes a target vector from each input object in the training set based on known pairwise dissimilarities, without constructing a mapping. By contrast, our method produces a non-linear mapping ... |

1497 | Eigenfaces vs. Fisherfaces: Recognition Using Class Specific Linear Projection. Special Theme Issue on Face and Gesture Recognition of the
- Belhumeur, Hespanha, et al.
- 1997
Citation Context ...infinity (see figure 2), the following condition is sufficient: Condition 3 The negative of the gradient of $H(E_W^G, E_W^I)$ on the margin line $E_W^G + m = E_W^I$ has a positive dot product with the direction [-1, 1]. To prove this, we state and prove the following theorem. Theorem 1 Let $H(E_W^G, E_W^I)$ be convex in $E_W^G$ and $E_W^I$ and have a minimum at infinity. Assume that there exists a W for a sample point such that... |

732 | Gradient-based learning applied to document recognition
- Bengio, Haffner
Citation Context ... the choice of $G_W(X)$. In particular, we will use architectures which are designed to extract representations that are robust to geometric distortions of the input, such as convolutional networks [8]. The resulting similarity metric will be robust to small differences of the pose between the pairs of images. Since the dimension of the target space is low and the natural distance in that space is ... |

302 | The AR face database
- Martinez, Benavente
- 1998
Citation Context ... 56x46 using 4x4 subsampling. The second set of training and testing experiments was performed by combining two datasets: the AR Database of Faces, created at Purdue University and publicly available [11], and a subset of the grayscale Feret Database [2]. Image pairs from both of these datasets were used in training, but only images from the AR dataset were used for testing. The AR dataset comprises 3... |

155 | Face recognition: A convolutional neural network approach - Lawrence, Tsoi, et al. - 1997 |

148 | Recognizing Imprecisely Localized, Partially Occluded and Expression Variant Faces from a Single Sample per Class
- Martinez
- 2002
Citation Context ...example, which has been applied to face recognition, is elastic matching [6]. Others have advocated warping-based normalization algorithms to maximally reduce the variations of appearance due to pose [10]. The invariance properties of all these models are hand-designed in advance. In the method described in this paper, the invariance properties do not come from prior knowledge about the task, but they... |

126 | Transformation invariance in pattern recognition - tangent distance and tangent propagation
- Simard, LeCun, et al.
- 1998
Citation Context ...al expression, glasses, and obscuring scarves). Some authors have described similarity metrics that are locally invariant to a set of known transformations. One example is the Tangent Distance method [19]. Another example, which has been applied to face recognition, is elastic matching [6]. Others have advocated warping-based normalization algorithms to maximally reduce the variations of appearance du... |

51 | Energy-based models for sparse overcomplete representations
- Teh, Welling, et al.
- 2003
Citation Context ...le probabilistic models assign a normalized probability to every possible configuration of the variables being modeled, energy-based models (EBM) assign an unnormalized energy to those configurations [18, 9]. Prediction in such systems is performed by searching for configurations of the variables that minimize the energy. EBMs are used in situations where the energies for various configurations must be c... |

44 | The FERET verification testing protocol for face recognition algorithms
- Moon, Phillips
- 1998
Citation Context ...s because making the probability of a particular pair high automatically makes the probability of other pairs low. 2.1. Face Verification with Learned Similarity Metrics The task of face verification [12], is to accept or reject the claimed identity of a subject in an image. Performance is assessed using two measures: percentage of false accepts and the percentage of false rejects. A good system shoul... |

43 | Face Recognition Using Kernel Eigenfaces
- Yang, Ahuja, et al.
- 2000
Citation Context ...thod [3] is also linear, but trained discriminatively so as to maximize the ratio of inter-class and intra-class variances. Nonlinear extensions based on Kernel-PCA and Kernel-LDA have been discussed [5]. See [14] for a review of subspace methods for face recognition. One major shortcoming of all those approaches is that they are very sensitive to geometric transformations of the input images (shift,... |

31 | Signature verification using a "siamese" time delay neural network
- Bromley, Bentz, et al.
- 1993
Citation Context ...n differentiability with respect to $W$. Because the same function $G$ with the same parameter $W$ is used to process both inputs, the similarity metric is symmetric. This is called a siamese architecture [4]. To build a face verification system with this method, we first train the model to produce output vectors that are nearby for pairs of images from the same person, and far away for pairs of images fr... |

30 | Loss functions for discriminative training of energy-based models
- LeCun, Huang
- 2005
Citation Context ...le probabilistic models assign a normalized probability to every possible configuration of the variables being modeled, energy-based models (EBM) assign an unnormalized energy to those configurations [18, 9]. Prediction in such systems is performed by searching for configurations of the variables that minimize the energy. EBMs are used in situations where the energies for various configurations must be c... |

8 | Distortion-invariant object recognition in the dynamic link architecture
- Lades, Vorbruggen, et al.
- 1993
Citation Context ...metrics that are locally invariant to a set of known transformations. One example is the Tangent Distance method [19]. Another example, which has been applied to face recognition, is elastic matching [6]. Others have advocated warping-based normalization algorithms to maximally reduce the variations of appearance due to pose [10]. The invariance properties of all these models are hand-designed in adv... |

5 | A neural support vector network architecture with adaptive kernels - Vincent, Bengio - 2000 |

4 | Handbook of Face Recognition, chapter Face Recognition in Subspaces
- Shakhnarovich, Moghaddam
- 2004
Citation Context ...is also linear, but trained discriminatively so as to maximize the ratio of inter-class and intra-class variances. Nonlinear extensions based on Kernel-PCA and Kernel-LDA have been discussed [5]. See [14] for a review of subspace methods for face recognition. One major shortcoming of all those approaches is that they are very sensitive to geometric transformations of the input images (shift, scaling, ... |

1 | The earth mover's distance, multi-dimensional scaling, and color-based image retrieval
- Rubner, Guibas, et al.
- 1997
Citation Context ...s derived from the discriminative learning framework for energy-based models (EBM). Our method is very different from other dimensionality reduction techniques such as Multi-Dimensional Scaling (MDS) [13] and Local Linear Embedding (LLE) [15] MDS computes a target vector from each input object in the training set based on known pairwise dissimilarities, without constructing a mapping. By contrast, our... |
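
### Illustrative sketch

Several of the contexts above describe a siamese architecture: one function G with one parameter set W maps both inputs to a low-dimensional space, and the distance there serves as a symmetric similarity metric used to accept or reject a claimed identity. The following is a minimal sketch of that idea, not the paper's implementation. The linear map in `embed`, the Euclidean distance, the `threshold` value, and all names and numbers are assumptions made for illustration; the paper itself uses convolutional networks for G.

```python
import math

def embed(weights, x):
    """Hypothetical stand-in for G_W: a linear map from x to a low-dimensional vector."""
    return [sum(w * xi for w, xi in zip(row, x)) for row in weights]

def energy(weights, x1, x2):
    """Distance between the two embeddings. The SAME weights process both inputs,
    which is what makes the metric symmetric. Euclidean distance is an assumption."""
    g1 = embed(weights, x1)
    g2 = embed(weights, x2)
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(g1, g2)))

def verify(weights, x1, x2, threshold):
    """Accept the claimed identity when the energy is below an assumed threshold."""
    return energy(weights, x1, x2) < threshold

# Toy parameters and inputs, chosen only to make the arithmetic easy to follow.
W = [[1.0, 0.0, -1.0],
     [0.5, 0.5, 0.0]]
a = [1.0, 2.0, 3.0]
b = [1.0, 2.0, 2.0]

print(energy(W, a, b) == energy(W, b, a))  # symmetry check
print(verify(W, a, b, threshold=1.5))
```

Raising the threshold accepts more pairs (more false accepts, fewer false rejects), and lowering it does the opposite, which is the trade-off the two performance measures in the face verification context capture.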