Results 1 - 10
of
12
Distance metric learning for large margin nearest neighbor classification
- In NIPS
, 2006
"... We show how to learn a Mahanalobis distance metric for k-nearest neighbor (kNN) classification by semidefinite programming. The metric is trained with the goal that the k-nearest neighbors always belong to the same class while examples from different classes are separated by a large margin. On seven ..."
Abstract
-
Cited by 177 (7 self)
- Add to MetaCart
We show how to learn a Mahanalobis distance metric for k-nearest neighbor (kNN) classification by semidefinite programming. The metric is trained with the goal that the k-nearest neighbors always belong to the same class while examples from different classes are separated by a large margin. On seven data sets of varying size and difficulty, we find that metrics trained in this way lead to significant improvements in kNN classification—for example, achieving a test error rate of 1.3 % on the MNIST handwritten digits. As in support vector machines (SVMs), the learning problem reduces to a convex optimization based on the hinge loss. Unlike learning in SVMs, however, our framework requires no modification or extension for problems in multiway (as opposed to binary) classification. 1
Fast solvers and efficient implementations for distance metric learning
- In ICML
, 2008
"... In this paper we study how to improve nearest neighbor classification by learning a Mahalanobis distance metric. We build on a recently proposed framework for distance metric learning known as large margin nearest neighbor (LMNN) classification. Our paper makes three contributions. First, we describ ..."
Abstract
-
Cited by 18 (4 self)
- Add to MetaCart
In this paper we study how to improve nearest neighbor classification by learning a Mahalanobis distance metric. We build on a recently proposed framework for distance metric learning known as large margin nearest neighbor (LMNN) classification. Our paper makes three contributions. First, we describe a highly efficient solver for the particular instance of semidefinite programming that arises in LMNN classification; our solver can handle problems with billions of large margin constraints in a few hours. Second, we show how to reduce both training and testing times using metric ball trees; the speedups from ball trees are further magnified by learning low dimensional representations of the input space. Third, we show how to learn different Mahalanobis distance metrics in different parts of the input space. For large data sets, the use of locally adaptive distance metrics leads to even lower error rates. 1.
Learning to reduce the semantic gap in Web image retrieval and annotation
- In SIGIR '08
, 2008
"... We study in this paper the problem of bridging the semantic gap between low-level image features and high-level semantic concepts, which is the key hindrance in content-based image retrieval. Piloted by the rich textual information of Web images, the proposed framework tries to learn a new distance ..."
Abstract
-
Cited by 12 (0 self)
- Add to MetaCart
We study in this paper the problem of bridging the semantic gap between low-level image features and high-level semantic concepts, which is the key hindrance in content-based image retrieval. Piloted by the rich textual information of Web images, the proposed framework tries to learn a new distance measure in the visual space, which can be used to retrieve more semantically relevant images for any unseen query image. The framework differentiates with traditional distance metric learning methods in the following ways. 1) A ranking-based distance metric learning method is proposed for image retrieval problem, by optimizing the leave-one-out retrieval performance on the training data. 2) To be scalable, millions of images together with rich textual information have been crawled from the Web to learn the similarity measure, and the learning framework particularly considers the indexing problem to ensure the retrieval efficiency. 3) To alleviate the noises in the unbalanced labels of images and fully utilize the textual information, a Latent Dirichlet Allocation based topiclevel text model is introduced to define pairwise semantic similarity between any two images. The learnt distance measure can be directly applied to applications such as content-based image retrieval and search-based image annotation. Experimental results on the two applications in a two million Web image database show both the effectiveness and efficiency of the proposed framework.
Contextual Identity Recognition in Personal Photo Albums
"... We present an efficient probabilistic method for identity recognition in personal photo albums. Personal photos are usually taken under uncontrolled conditions – the captured faces exhibit significant variations in pose, expression and illumination that limit the success of traditional face recognit ..."
Abstract
-
Cited by 9 (0 self)
- Add to MetaCart
We present an efficient probabilistic method for identity recognition in personal photo albums. Personal photos are usually taken under uncontrolled conditions – the captured faces exhibit significant variations in pose, expression and illumination that limit the success of traditional face recognition algorithms. We show how to improve recognition rates by incorporating additional cues present in personal photo collections, such as clothing appearance and information about when the photo was taken. This is done by constructing a Markov Random Field (MRF) that effectively combines all available contextual cues in a principled recognition framework. Performing inference in the MRF produces markedly improved recognition results in a challenging dataset consisting of the personal photo collections of multiple people. At the same time, the computational cost of our approach remains comparable to that of standard face recognition approaches. 1.
S.: Local distance functions: A taxonomy, new algorithms, and an evaluation
- In: Proc. ICCV (2009
"... We present a taxonomy for local distance functions where most existing algorithms can be regarded as approximations of the geodesic distance defined by a metric tensor. We categorize existing algorithms by how, where and when they estimate the metric tensor. We also extend the taxonomy along each ax ..."
Abstract
-
Cited by 8 (0 self)
- Add to MetaCart
We present a taxonomy for local distance functions where most existing algorithms can be regarded as approximations of the geodesic distance defined by a metric tensor. We categorize existing algorithms by how, where and when they estimate the metric tensor. We also extend the taxonomy along each axis. How: We introduce hybrid algorithms that use a combination of dimensionality reduction and metric learning to ameliorate over-fitting. Where: We present an exact polynomial time algorithm to integrate the metric tensor along the lines between the test and training points under the assumption that the metric tensor is piecewise constant. When: We propose an interpolation algorithm where the metric tensor is sampled at a number of references points during the offline phase, which are then interpolated during online classification. We also present a comprehensive evaluation of all the algorithms on tasks in face recognition, object recognition, and digit recognition. 1.
Low-Rank Kernel Learning with Bregman Matrix Divergences
"... In this paper, we study low-rank matrix nearness problems, with a focus on learning lowrank positive semidefinite (kernel) matrices for machine learning applications. We propose efficient algorithms that scale linearly in the number of data points and quadratically in the rank of the input matrix. E ..."
Abstract
-
Cited by 6 (1 self)
- Add to MetaCart
In this paper, we study low-rank matrix nearness problems, with a focus on learning lowrank positive semidefinite (kernel) matrices for machine learning applications. We propose efficient algorithms that scale linearly in the number of data points and quadratically in the rank of the input matrix. Existing algorithms for learning kernel matrices often scale poorly, with running times that are cubic in the number of data points. We employ Bregman matrix divergences as the measures of nearness—these divergences are natural for learning low-rank kernels since they preserve rank as well as positive semidefiniteness. Special cases of our framework yield faster algorithms for various existing learning problems, and experimental results demonstrate that our algorithms can effectively learn both low-rank and full-rank kernel matrices.
Convex Perturbations for Scalable Semidefinite Programming
"... Many important machine learning problems are modeled and solved via semidefinite programs; examples include metric learning, nonlinear embedding, and certain clustering problems. Often, off-the-shelf software is invoked for the associated optimization, which can be inappropriate due to excessive com ..."
Abstract
-
Cited by 3 (2 self)
- Add to MetaCart
Many important machine learning problems are modeled and solved via semidefinite programs; examples include metric learning, nonlinear embedding, and certain clustering problems. Often, off-the-shelf software is invoked for the associated optimization, which can be inappropriate due to excessive computational and storage requirements. In this paper, we introduce the use of convex perturbations for solving semidefinite programs (SDPs), and for a specific perturbation we derive an algorithm that has several advantages over existing techniques: a) it is simple, requiring only a few lines of MATLAB, b) it is a first-order method, and thereby scalable, and c) it can easily exploit the structure of a given SDP (e.g., when the constraint matrices are low-rank, a situation common to several machine learning SDPs). A pleasant byproduct of our method is a fast, kernelized version of the large-margin nearest neighbor metric learning algorithm (Weinberger et al., 2005). We demonstrate that our algorithm is effective in finding fast approximations to large-scale SDPs arising in some machine learning applications. 1
Metric Embedding for Kernel Classification Rules
"... In this paper, we consider a smoothing kernel based classification rule and propose an algorithm for optimizing the performance of the rule by learning the bandwidth of the smoothing kernel along with a data-dependent distance metric. The data-dependent distance metric is obtained by learning a func ..."
Abstract
-
Cited by 1 (0 self)
- Add to MetaCart
In this paper, we consider a smoothing kernel based classification rule and propose an algorithm for optimizing the performance of the rule by learning the bandwidth of the smoothing kernel along with a data-dependent distance metric. The data-dependent distance metric is obtained by learning a function that embeds an arbitrary metric space into a Euclidean space while minimizing an upper bound on the resubstitution estimate of the error probability of the kernel classification rule. By restricting this embedding function to a reproducing kernel Hilbert space, we reduce the problem to solving a semidefinite program and show the resulting kernel classification rule to be a variation of the k-nearest neighbor rule. We compare the performance of the kernel rule (using the learned data-dependent distance metric) to state-of-the-art distance metric learning algorithms (designed for k-nearest neighbor classification) on some benchmark datasets. The results show that the proposed rule has either better or as good classification accuracy as the other metric learning algorithms. 1.
SEMI-SUPERVISED DISTANCE METRIC LEARNING FOR VISUAL OBJECT CLASSIFICATION
"... Dimensionality reduction, image segmentation, metric learning, pairwise constraints, semi-supervised learning, visual object classification. This paper describes a semi-supervised distance metric learning algorithm which uses pairwise equivalence (similarity and dissimilarity) constraints to discove ..."
Abstract
- Add to MetaCart
Dimensionality reduction, image segmentation, metric learning, pairwise constraints, semi-supervised learning, visual object classification. This paper describes a semi-supervised distance metric learning algorithm which uses pairwise equivalence (similarity and dissimilarity) constraints to discover the desired groups within high-dimensional data. As opposed to the traditional full rank distance metric learning algorithms, the proposed method can learn nonsquare projection matrices that yield low rank distance metrics. This brings additional benefits such as visualization of data samples and reducing the storage cost, and it is more robust to overfitting since the number of estimated parameters is greatly reduced. Our method works in both the input and kernel induced-feature space, and the distance metric is found by a gradient descent procedure that involves an eigen-decomposition in each step. Experimental results on high-dimensional visual object classification problems show that the computed distance metric improves the performance of the subsequent clustering algorithm. 1
Convex Optimizations for Distance Metric Learning and Pattern Classification
"... The goal of machine learning is to build automated systems that can classify and recognize complex patterns in data. Not surprisingly, the representation of the data plays an important role in determining what types of patterns can be automatically discovered. Many algorithms for machine learning as ..."
Abstract
- Add to MetaCart
The goal of machine learning is to build automated systems that can classify and recognize complex patterns in data. Not surprisingly, the representation of the data plays an important role in determining what types of patterns can be automatically discovered. Many algorithms for machine learning assume that the data are represented as elements in a metric space. For example, in popular algorithms such as nearest-neighbor classification, vector quantization, and kernel density estimation, the metric distances between different examples provide a measure of their dissimilarity [1]. The performance of these algorithms can depend sensitively on the manner in which distances are measured. When data are represented as points in a multidimensional vector space, simple Euclidean distances are often used to measure the dissimilarity between different examples. However, such distances often do not yield reliable judgments; in addition, they cannot highlight the distinctive features that play a role in certain types of classification, but not others. For example, consider two schemes for clustering images of faces: one by age, one by gender. Images can be represented as points in a multidimensional vector space in many ways—for example, by enumerating their pixel values, or by computing color histograms. However the images are represented, different components of these feature vectors are likely to be relevant for clustering by age versus clustering by gender. Naturally, for these different types of clustering, we need different ways of measuring dissimilarity; in particular, we need different metrics for computing distances between feature vectors. This article describes two algorithms for learning such distance metrics based on recent developments in convex optimization. 1

