Distance metric learning, with application to clustering with side information. NIPS (2002)

by E. P. Xing, A. Y. Ng, M. I. Jordan, S. Russell

Results 1 - 10 of 818

Distance metric learning for large margin nearest neighbor classification

by Kilian Q. Weinberger, John Blitzer, Lawrence K. Saul - In NIPS, 2006
"... We show how to learn a Mahanalobis distance metric for k-nearest neighbor (kNN) classification by semidefinite programming. The metric is trained with the goal that the k-nearest neighbors always belong to the same class while examples from different classes are separated by a large margin. On seven ..."
Abstract - Cited by 695 (14 self) - Add to MetaCart
We show how to learn a Mahalanobis distance metric for k-nearest neighbor (kNN) classification by semidefinite programming. The metric is trained with the goal that the k-nearest neighbors always belong to the same class while examples from different classes are separated by a large margin. On seven data sets of varying size and difficulty, we find that metrics trained in this way lead to significant improvements in kNN classification, for example achieving a test error rate of 1.3% on the MNIST handwritten digits. As in support vector machines (SVMs), the learning problem reduces to a convex optimization based on the hinge loss. Unlike learning in SVMs, however, our framework requires no modification or extension for problems in multiway (as opposed to binary) classification.
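As a rough illustration of the large-margin objective this abstract describes (not the authors' implementation, which optimizes the metric by semidefinite programming), the numpy sketch below evaluates a hinge-style loss of that flavor for a fixed candidate metric M over hypothetical (anchor, target-neighbor, impostor) index triplets:

import numpy as np

def mahalanobis_sq(M, xi, xj):
    # Squared Mahalanobis distance (xi - xj)^T M (xi - xj) for a PSD matrix M.
    d = xi - xj
    return float(d @ M @ d)

def lmnn_style_loss(M, X, triplets, margin=1.0, push_weight=0.5):
    # Pull anchors toward same-class target neighbors; push differently labeled
    # impostors at least `margin` farther away than those neighbors (hinge loss).
    pull, push = 0.0, 0.0
    for i, j, l in triplets:
        d_ij = mahalanobis_sq(M, X[i], X[j])    # anchor to same-class neighbor
        d_il = mahalanobis_sq(M, X[i], X[l])    # anchor to impostor
        pull += d_ij
        push += max(0.0, margin + d_ij - d_il)  # hinge on the margin violation
    return (1.0 - push_weight) * pull + push_weight * push

# Toy usage with a hypothetical data set and the identity metric.
X = np.array([[0.0, 0.0], [0.1, 0.0], [1.0, 1.0]])
triplets = [(0, 1, 2)]   # point 1 is a target neighbor of 0, point 2 an impostor
print(lmnn_style_loss(np.eye(2), X, triplets))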

Citation Context

...es are separated by a large margin. Our goal for metric learning differs in a crucial way from those of previous approaches that minimize the pairwise distances between all similarly labeled examples [12, 13, 17]. This latter objective is far more difficult to achieve and does not leverage the full power of kNN classification, whose accuracy does not require that all similarly labeled inputs be tightly cluste...

Image retrieval: ideas, influences, and trends of the new age

by Ritendra Datta, Dhiraj Joshi, Jia Li, James Z. Wang - ACM COMPUTING SURVEYS, 2008
"... We have witnessed great interest and a wealth of promise in content-based image retrieval as an emerging technology. While the last decade laid foundation to such promise, it also paved the way for a large number of new techniques and systems, got many new people involved, and triggered stronger ass ..."
Abstract - Cited by 485 (13 self) - Add to MetaCart
We have witnessed great interest and a wealth of promise in content-based image retrieval as an emerging technology. While the last decade laid foundation to such promise, it also paved the way for a large number of new techniques and systems, got many new people involved, and triggered stronger association of weakly related fields. In this article, we survey almost 300 key theoretical and empirical contributions in the current decade related to image retrieval and automatic image annotation, and in the process discuss the spawning of related subfields. We also discuss significant challenges involved in the adaptation of existing image retrieval techniques to build systems that can be useful in the real world. In retrospect of what has been achieved so far, we also conjecture what the future may hold for image retrieval research.

Citation Context

...been explored for various tasks such as clustering and classification. One way to achieve this is to learn a generalized Mahalanobis distance metric, such as those general-purpose methods proposed in [Xing et al. 2003; Bar-Hillel et al. 2005]. On the other hand, kernel-based learning of image similarity, using context information, with applications to image clustering was explored in [Wu et al. 2005]. This could p...

Online passive-aggressive algorithms

by Koby Crammer, Ofer Dekel, Shai Shalev-Shwartz, Yoram Singer - JMLR, 2006
"... We present a unified view for online classification, regression, and uniclass problems. This view leads to a single algorithmic framework for the three problems. We prove worst case loss bounds for various algorithms for both the realizable case and the non-realizable case. The end result is new alg ..."
Abstract - Cited by 435 (24 self) - Add to MetaCart
We present a unified view for online classification, regression, and uniclass problems. This view leads to a single algorithmic framework for the three problems. We prove worst case loss bounds for various algorithms for both the realizable case and the non-realizable case. The end result is new algorithms and accompanying loss bounds for hinge-loss regression and uniclass. We also get refined loss bounds for previously studied classification algorithms.
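For intuition about the margin-based online updates this framework unifies, here is a minimal sketch of the standard passive-aggressive step for binary classification (only one member of the family the paper analyzes; variable names are illustrative):

import numpy as np

def pa_update(w, x, y):
    # One passive-aggressive step for labels y in {-1, +1}: leave w unchanged when
    # the example already has margin >= 1 (passive), otherwise move just enough to
    # satisfy the margin (aggressive), with step size tau = hinge_loss / ||x||^2.
    loss = max(0.0, 1.0 - y * float(w @ x))   # hinge loss on this example
    if loss > 0.0:
        tau = loss / float(x @ x)
        w = w + tau * y * x
    return w

w = np.zeros(3)
w = pa_update(w, np.array([1.0, 0.5, -0.2]), +1)
print(w)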

Mixed membership stochastic block models for relational data with application to protein-protein interactions

by Edoardo M. Airoldi, David M. Blei, Stephen E. Fienberg, Eric P. Xing, Tommi Jaakkola - In Proceedings of the International Biometrics Society Annual Meeting, 2006
"... We develop a model for examining data that consists of pairwise measurements, for example, presence or absence of links between pairs of objects. Examples include protein interactions and gene regulatory networks, collections of author-recipient email, and social networks. Analyzing such data with p ..."
Abstract - Cited by 378 (52 self) - Add to MetaCart
We develop a model for examining data that consists of pairwise measurements, for example, presence or absence of links between pairs of objects. Examples include protein interactions and gene regulatory networks, collections of author-recipient email, and social networks. Analyzing such data with probabilistic models requires special assumptions, since the usual independence or exchangeability assumptions no longer hold. We introduce a class of latent variable models for pairwise measurements: mixed membership stochastic blockmodels. Models in this class combine a global model of dense patches of connectivity (blockmodel) and a local model to instantiate node-specific variability in the connections (mixed membership). We develop a general variational inference algorithm for fast approximate posterior inference. We demonstrate the advantages of mixed membership stochastic blockmodels with applications to social networks and protein interaction networks.
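To make the combination of a blockmodel with node-specific mixed membership concrete, here is a hedged sketch of the kind of generative process the abstract describes; the priors, parameter names, and values below are assumptions for illustration, not the paper's exact model:

import numpy as np

def sample_mmsb(n, alpha, B, seed=0):
    # Sketch of a mixed membership blockmodel: each node draws a membership vector
    # pi_i ~ Dirichlet(alpha); for each ordered pair (i, j) a sender role and a
    # receiver role are drawn from pi_i and pi_j, and a link appears with
    # probability B[role_i, role_j].
    rng = np.random.default_rng(seed)
    k = len(alpha)
    pi = rng.dirichlet(alpha, size=n)
    Y = np.zeros((n, n), dtype=int)
    for i in range(n):
        for j in range(n):
            if i == j:
                continue
            zi = rng.choice(k, p=pi[i])
            zj = rng.choice(k, p=pi[j])
            Y[i, j] = rng.binomial(1, B[zi, zj])
    return Y

B = np.array([[0.9, 0.05], [0.05, 0.8]])   # dense within blocks, sparse across blocks
print(sample_mmsb(5, alpha=[0.5, 0.5], B=B))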

Citation Context

...), text and images (Barnard et al., 2003), multiple disability measures (Erosheva and Fienberg, 2005; Manton et al., 1994), and genetics information (Rosenberg et al., 2002; Pritchard et al., 2000; Xing et al., 2003). These models use a simple generative model, such as bag-of-words or naive Bayes, embedded in a hierarchical Bayesian framework involving a latent variable structure; this induces dependencies and i...

Information-theoretic metric learning

by Jason Davis, Brian Kulis, Suvrit Sra, Inderjit Dhillon - In NIPS 2006 Workshop on Learning to Compare Examples, 2007
"... We formulate the metric learning problem as that of minimizing the differential relative entropy between two multivariate Gaussians under constraints on the Mahalanobis distance function. Via a surprising equivalence, we show that this problem can be solved as a low-rank kernel learning problem. Spe ..."
Abstract - Cited by 359 (15 self) - Add to MetaCart
We formulate the metric learning problem as that of minimizing the differential relative entropy between two multivariate Gaussians under constraints on the Mahalanobis distance function. Via a surprising equivalence, we show that this problem can be solved as a low-rank kernel learning problem. Specifically, we minimize the Burg divergence of a low-rank kernel to an input kernel, subject to pairwise distance constraints. Our approach has several advantages over existing methods. First, we present a natural information-theoretic formulation for the problem. Second, the algorithm utilizes the methods developed by Kulis et al. [6], which do not involve any eigenvector computation; in particular, the running time of our method is faster than most existing techniques. Third, the formulation offers insights into connections between metric learning and kernel learning.

Citation Context

...s the number of distance constraints, and d is the dimensionality of the data. In particular, this method does not require costly eigenvalue computations, unlike many other metric learning algorithms [4, 10, 11]. 2 Problem Formulation: Given a set of n points {x_1, ..., x_n} in ℜ^d, we seek a positive definite matrix A which parameterizes the Mahalanobis distance: d_A(x_i, x_j) = (x_i − x_j)^T A (x_i − x_j). We assume th...
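The Mahalanobis parameterization quoted in this context, d_A(x_i, x_j) = (x_i − x_j)^T A (x_i − x_j), is straightforward to compute directly. The sketch below uses a hypothetical matrix A built as A = L L^T (not a learned one), which also makes explicit the equivalent view of d_A as a squared Euclidean distance after the linear map x -> L^T x:

import numpy as np

def mahalanobis_sq(A, xi, xj):
    # d_A(xi, xj) = (xi - xj)^T A (xi - xj) for a positive (semi)definite matrix A.
    d = xi - xj
    return float(d @ A @ d)

rng = np.random.default_rng(0)
L = rng.standard_normal((3, 3))
A = L @ L.T                     # A = L L^T is positive semidefinite by construction
xi, xj = rng.standard_normal(3), rng.standard_normal(3)
print(mahalanobis_sq(A, xi, xj))
print(float(np.sum((L.T @ (xi - xj)) ** 2)))   # same value, via the factorized view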

A Probabilistic Framework for Semi-Supervised Clustering

by Sugato Basu, 2004
"... Unsupervised clustering can be significantly improved using supervision in the form of pairwise constraints, i.e., pairs of instances labeled as belonging to same or different clusters. In recent years, a number of algorithms have been proposed for enhancing clustering quality by employing such supe ..."
Abstract - Cited by 275 (14 self) - Add to MetaCart
Unsupervised clustering can be significantly improved using supervision in the form of pairwise constraints, i.e., pairs of instances labeled as belonging to the same or different clusters. In recent years, a number of algorithms have been proposed for enhancing clustering quality by employing such supervision. Such methods use the constraints to either modify the objective function, or to learn the distance measure. We propose a probabilistic model for semi-supervised clustering based on Hidden Markov Random Fields (HMRFs) that provides a principled framework for incorporating supervision into prototype-based clustering. The model generalizes a previous approach that combines constraints and Euclidean distance learning, and allows the use of a broad range of clustering distortion measures, including Bregman divergences (e.g., Euclidean distance and I-divergence) and directional similarity measures (e.g., cosine similarity). We present an algorithm that performs partitional semi-supervised clustering of data by minimizing an objective function derived from the posterior energy of the HMRF model. Experimental results on several text data sets demonstrate the advantages of the proposed framework.
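As a toy illustration of the kind of constrained objective such methods minimize (a sketch of the general idea, not the paper's HMRF posterior-energy formulation), the snippet below adds a penalty for each violated must-link or cannot-link constraint to the usual k-means distortion:

import numpy as np

def constrained_kmeans_objective(X, centers, labels, must_link, cannot_link, w=1.0):
    # k-means distortion plus a penalty of w for every must-link pair split across
    # clusters and every cannot-link pair placed in the same cluster.
    distortion = sum(float(np.sum((X[i] - centers[labels[i]]) ** 2)) for i in range(len(X)))
    violations = sum(labels[i] != labels[j] for i, j in must_link)
    violations += sum(labels[i] == labels[j] for i, j in cannot_link)
    return distortion + w * violations

# Hypothetical data, cluster centers, and constraints.
X = np.array([[0.0, 0.0], [0.2, 0.1], [2.0, 2.0]])
centers = np.array([[0.1, 0.05], [2.0, 2.0]])
labels = np.array([0, 0, 1])
print(constrained_kmeans_objective(X, centers, labels, [(0, 1)], [(0, 2)]))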

Citation Context

...work on distance-based semi-supervised clustering with pairwise constraints, Cohn et al. [13] used gradient descent for weighted Jensen-Shannon divergence in the context of EM clustering. Xing et al. [39] utilized a combination of gradient descent and iterative projections to learn a Mahalanobis distance for K-Means clustering. The Redundant Component Analysis (RCA) algorithm used only must-link const...

Learning structured prediction models: A large margin approach

by Ben Taskar, 2004
"... ..."
Abstract - Cited by 231 (8 self) - Add to MetaCart
Abstract not found

Metric Learning by Collapsing Classes

by Amir Globerson, Sam Roweis
"... We present an algorithm for learning a quadratic Gaussian metric (Mahalanobis distance) for use in classification tasks. Our method relies on the simple geometric intuition that a good metric is one under which points in the same class are simultaneously near each other and far from points in th ..."
Abstract - Cited by 230 (2 self) - Add to MetaCart
We present an algorithm for learning a quadratic Gaussian metric (Mahalanobis distance) for use in classification tasks. Our method relies on the simple geometric intuition that a good metric is one under which points in the same class are simultaneously near each other and far from points in the other classes. We construct a convex optimization problem whose solution generates such a metric by trying to collapse all examples in the same class to a single point and push examples in other classes infinitely far away. We show that when the metric we learn is used in simple classifiers, it yields substantial improvements over standard alternatives on a variety of problems. We also discuss how the learned metric may be used to obtain a compact low dimensional feature representation of the original input space, allowing more efficient classification with very little reduction in performance.
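One way to read the collapsing intuition is as a stochastic-neighbor objective: under the learned metric, each point's softmax distribution over neighbors should concentrate on same-class points. The sketch below is my reading of that idea, not the authors' exact convex formulation; it evaluates such a cross-entropy for a fixed metric on hypothetical data:

import numpy as np

def neighbor_softmax(A, X, i):
    # p(j | i) proportional to exp(-d_A(x_i, x_j)) over all j != i.
    d = X - X[i]
    dists = np.einsum('nd,de,ne->n', d, A, d)   # squared Mahalanobis distances to x_i
    dists[i] = np.inf                           # exclude the point itself
    p = np.exp(-dists)
    return p / p.sum()

def collapse_style_loss(A, X, y):
    # Cross-entropy between the ideal "same class only" target and p(. | i), summed over i.
    loss = 0.0
    for i in range(len(X)):
        p = neighbor_softmax(A, X, i)
        same = (y == y[i]) & (np.arange(len(X)) != i)
        target = same / same.sum()              # uniform over same-class neighbors
        loss -= float(np.sum(target * np.log(p + 1e-12)))
    return loss

X = np.array([[0.0, 0.0], [0.1, 0.0], [1.0, 1.0], [1.1, 0.9]])
y = np.array([0, 0, 1, 1])
print(collapse_style_loss(np.eye(2), X, y))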

Learning a distance metric from relative comparisons

by Matthew Schultz, Thorsten Joachims - In Proc. Advances in Neural Information Processing Systems, 2003
"... This paper presents a method for learning a distance metric from rel-ative comparison such as “A is closer to B than A is to C”. Taking a Support Vector Machine (SVM) approach, we develop an algorithm that provides a flexible way of describing qualitative training data as a set of constraints. We sh ..."
Abstract - Cited by 195 (0 self) - Add to MetaCart
This paper presents a method for learning a distance metric from relative comparisons such as “A is closer to B than A is to C”. Taking a Support Vector Machine (SVM) approach, we develop an algorithm that provides a flexible way of describing qualitative training data as a set of constraints. We show that such constraints lead to a convex quadratic programming problem that can be solved by adapting standard methods for SVM training. We empirically evaluate the performance and the modelling flexibility of the algorithm on a collection of text documents.
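A small sketch of how such qualitative comparisons can be expressed as checkable constraints: a triple (a, b, c), read as "a is closer to b than a is to c", asks that d_A(x_a, x_c) − d_A(x_a, x_b) exceed a margin. The helper below only counts violations under a fixed metric; the paper's actual quadratic program also optimizes the metric with slack variables, and the names here are illustrative:

import numpy as np

def count_violations(A, X, triples, margin=1.0):
    # Count comparisons "a is closer to b than to c" violated under the metric A.
    def d(i, j):
        v = X[i] - X[j]
        return float(v @ A @ v)
    return sum(d(a, c) - d(a, b) < margin for a, b, c in triples)

X = np.array([[0.0, 0.0], [0.3, 0.0], [2.0, 0.0]])
triples = [(0, 1, 2)]   # "point 0 is closer to point 1 than to point 2"
print(count_violations(np.eye(2), X, triples))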

Citation Context

...e between A and B is 7.35”) as considered in metric Multidimensional Scaling (MDS) (see [4]), or absolute qualitative feedback (e.g. “A and B are similar”, “A and C are not similar”) as considered in [11]. Building on the study in [7], search-engine query logs are one example where feedback of the form “A is closer to B than A is to C” is readily available for learning a (more semantic) similarity met...

Is that you? Metric learning approaches for face identification

by Matthieu Guillaumin, Jakob Verbeek, Cordelia Schmid - In ICCV, 2009
"... Face identification is the problem of determining whether two face images depict the same person or not. This is difficult due to variations in scale, pose, lighting, background, expression, hairstyle, and glasses. In this paper we present two methods for learning robust distance measures: (a) a log ..."
Abstract - Cited by 159 (8 self) - Add to MetaCart
Face identification is the problem of determining whether two face images depict the same person or not. This is difficult due to variations in scale, pose, lighting, background, expression, hairstyle, and glasses. In this paper we present two methods for learning robust distance measures: (a) a logistic discriminant approach which learns the metric from a set of labelled image pairs (LDML) and (b) a nearest neighbour approach which computes the probability for two images to belong to the same class (MkNN). We evaluate our approaches on the Labeled Faces in the Wild data set, a large and very challenging data set of faces from Yahoo! News. The evaluation protocol for this data set defines a restricted setting, where a fixed set of positive and negative image pairs is given, as well as an unrestricted one, where faces are labelled by their identity. We are the first to present results for the unrestricted setting, and show that our methods benefit from this richer training data, much more so than the current state-of-the-art method. Our results of 79.3% and 87.5% correct for the restricted and unrestricted setting respectively, significantly improve over the current state-of-the-art result of 78.5%. Confidence scores obtained for face identification can be used for many applications, e.g. clustering or recognition from a single training example. We show that our learned metrics also improve performance for these tasks.
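A minimal sketch of the logistic-discriminant idea described for LDML, under the assumption that the "same person" probability is modeled as a sigmoid of a bias minus the Mahalanobis distance between face descriptors; parameter names and values are illustrative, not the authors' implementation:

import numpy as np

def p_same(M, b, xi, xj):
    # P(same identity | xi, xj) modeled as sigmoid(b - d_M(xi, xj)), so smaller
    # distances under the learned metric M map to higher probabilities.
    d = xi - xj
    dist = float(d @ M @ d)
    return 1.0 / (1.0 + np.exp(-(b - dist)))

xi, xj = np.array([0.2, 0.1]), np.array([0.25, 0.05])
print(p_same(np.eye(2), 1.0, xi, xj))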