Results 11–20 of 189
Low-Rank Kernel Learning with Bregman Matrix Divergences
"... In this paper, we study lowrank matrix nearness problems, with a focus on learning lowrank positive semidefinite (kernel) matrices for machine learning applications. We propose efficient algorithms that scale linearly in the number of data points and quadratically in the rank of the input matrix. E ..."
Abstract

Cited by 45 (2 self)
In this paper, we study low-rank matrix nearness problems, with a focus on learning low-rank positive semidefinite (kernel) matrices for machine learning applications. We propose efficient algorithms that scale linearly in the number of data points and quadratically in the rank of the input matrix. Existing algorithms for learning kernel matrices often scale poorly, with running times that are cubic in the number of data points. We employ Bregman matrix divergences as the measures of nearness; these divergences are natural for learning low-rank kernels since they preserve rank as well as positive semidefiniteness. Special cases of our framework yield faster algorithms for various existing learning problems, and experimental results demonstrate that our algorithms can effectively learn both low-rank and full-rank kernel matrices.
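The Bregman matrix divergence central to this line of work can be made concrete. The following is a minimal sketch, assuming NumPy, of the LogDet (Burg) matrix divergence for positive definite matrices; the function name and test matrices are illustrative, not the paper's implementation:

```python
import numpy as np

def logdet_divergence(X, Y):
    """LogDet (Burg) matrix divergence D(X, Y) = tr(X Y^-1) - log det(X Y^-1) - n.

    Both X and Y must be symmetric positive definite. The divergence is zero
    iff X == Y, and it stays finite only when the range spaces agree, which is
    why minimizing it preserves rank and positive semidefiniteness.
    """
    n = X.shape[0]
    XYinv = X @ np.linalg.inv(Y)
    sign, logdet = np.linalg.slogdet(XYinv)
    if sign <= 0:
        raise ValueError("X Y^-1 must have positive determinant")
    return np.trace(XYinv) - logdet - n

A = np.array([[2.0, 0.5], [0.5, 1.0]])
print(logdet_divergence(A, A))  # ~0: a matrix has zero divergence from itself
```

For example, D(2I, I) on 2x2 matrices evaluates to tr(2I) - log det(2I) - 2 = 2 - 2 log 2.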
A survey on metric learning for feature vectors and structured data. arXiv preprint arXiv:1306.6709
, 2013
"... ar ..."
Metric and Kernel Learning Using a Linear Transformation
"... Metric and kernel learning arise in several machine learning applications. However, most existing metric learning algorithms are limited to learning metrics over lowdimensional data, while existing kernel learning algorithms are often limited to the transductive setting and do not generalize to new ..."
Abstract

Cited by 30 (2 self)
Metric and kernel learning arise in several machine learning applications. However, most existing metric learning algorithms are limited to learning metrics over low-dimensional data, while existing kernel learning algorithms are often limited to the transductive setting and do not generalize to new data points. In this paper, we study the connections between metric learning and kernel learning that arise when studying metric learning as a linear transformation learning problem. In particular, we propose a general optimization framework for learning metrics via linear transformations, and analyze in detail a special case of our framework: that of minimizing the LogDet divergence subject to linear constraints. We then propose a general regularized framework for learning a kernel matrix, and show it to be equivalent to our metric learning framework. Our theoretical connections between metric and kernel learning have two main consequences: 1) the learned kernel matrix parameterizes a linear transformation kernel function and can be applied inductively to new data points; 2) our result yields a constructive method for kernelizing most existing Mahalanobis metric learning formulations. We demonstrate our learning approach by applying it to large-scale real-world problems in computer vision, text mining, and semi-supervised kernel dimensionality reduction. Keywords: metric learning, kernel learning, linear transformation, matrix divergences, LogDet divergence
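One reason LogDet minimization under linear constraints is attractive is that projecting onto a single constraint has a closed-form rank-one update. A hedged sketch follows, assuming NumPy; the function name and constraint values are illustrative, and real LogDet-based metric learners additionally handle inequality constraints and slack:

```python
import numpy as np

def logdet_projection(A, v, b):
    """Bregman projection of a positive definite A onto {A' : v^T A' v = b}
    under the LogDet divergence. The projection is the rank-one update
    A' = A + beta (A v)(A v)^T, with beta chosen so the constraint holds
    exactly; A' remains positive definite whenever b > 0.
    """
    Av = A @ v
    p = float(v @ Av)            # current constraint value v^T A v
    beta = (b - p) / (p * p)     # makes v^T A' v = p + beta p^2 = b
    return A + beta * np.outer(Av, Av)

A = np.eye(3)
v = np.array([1.0, -1.0, 0.0])   # difference vector of two points
A_new = logdet_projection(A, v, 0.5)
print(float(v @ A_new @ v))                    # 0.5
print(bool(np.linalg.eigvalsh(A_new).min() > 0))  # True: still positive definite
```

Cycling such projections over all constraints is the standard route to the constrained LogDet minimum.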
A boosting framework for visuality-preserving distance metric learning and its application to medical image retrieval
 IEEE TPAMI
, 2010
"... Abstract—Similarity measurement is a critical component in contentbased image retrieval systems, and learning a good distance metric can significantly improve retrieval performance. However, despite extensive study, there are several major shortcomings with the existing approaches for distance metr ..."
Abstract

Cited by 30 (5 self)
Abstract—Similarity measurement is a critical component in content-based image retrieval systems, and learning a good distance metric can significantly improve retrieval performance. However, despite extensive study, there are several major shortcomings with the existing approaches for distance metric learning that can significantly affect their application to medical image retrieval. In particular, “similarity” can mean very different things in image retrieval: resemblance in visual appearance (e.g., two images that look like one another) or similarity in semantic annotation (e.g., two images of tumors that look quite different yet are both malignant). Current approaches for distance metric learning typically address only one goal without consideration of the other. This is problematic for medical image retrieval, where the goal is to assist doctors in decision making. In these applications, given a query image, the goal is to retrieve similar images from a reference library whose semantic annotations could provide the medical professional with greater insight into the possible interpretations of the query image. If the system were to retrieve images that did not look like the query, then users would be less likely to trust the system; on the other hand, retrieving images that appear superficially similar to the query but are semantically unrelated is undesirable because that could lead users toward an incorrect diagnosis. Hence, learning a distance metric that preserves both visual resemblance and semantic similarity is important. We emphasize that, although our study is focused on medical image retrieval, the problem addressed in this work is critical to many image retrieval systems. We present a boosting framework for distance metric learning that aims to preserve both visual and semantic similarities.
The boosting framework first learns a binary representation using side information, in the form of labeled pairs, and then computes the distance as a weighted Hamming distance.
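The weighted Hamming distance over learned binary codes is simple to state. A minimal sketch, assuming NumPy; in a boosting framework each bit would come from one learned weak hash function and each weight from its boosting coefficient (the names and example values here are illustrative):

```python
import numpy as np

def weighted_hamming(x, y, w):
    """Distance between binary codes x and y: the sum of weights w_k over
    the bit positions where x and y disagree."""
    x, y, w = np.asarray(x), np.asarray(y), np.asarray(w)
    return float(np.sum(w * (x != y)))

# Bits 1 and 2 disagree, so the distance is 1.0 + 2.0 = 3.0.
print(weighted_hamming([1, 0, 1, 1], [1, 1, 0, 1], [0.5, 1.0, 2.0, 0.25]))  # 3.0
```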
Adaptively Learning the Crowd Kernel
"... We introduce an algorithm that, given n objects, learns a similarity matrix over all n2 pairs, from crowdsourced data alone. The algorithm samples responses to adaptively chosen tripletbased relativesimilarity queries. Each query has the form “is object a more similar to b or to c? ” and is chosen ..."
Abstract

Cited by 30 (3 self)
We introduce an algorithm that, given n objects, learns a similarity matrix over all n² pairs from crowdsourced data alone. The algorithm samples responses to adaptively chosen triplet-based relative-similarity queries. Each query has the form “is object a more similar to b or to c?” and is chosen to be maximally informative given the preceding responses. The output is an embedding of the objects into Euclidean space (like MDS); we refer to this as the “crowd kernel.” SVMs reveal that the crowd kernel captures prominent and subtle features across a number of domains, such as “is striped” among neckties and “vowel vs. consonant” among letters.
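The triplet relation the crowd kernel is fit to reproduce can be sketched directly. Below, assuming NumPy, a toy stand-in answers the query from a known embedding by comparing Euclidean distances; real crowd votes are noisy, and the function name is illustrative:

```python
import numpy as np

def triplet_answer(E, a, b, c):
    """Answer "is object a more similar to b or to c?" for an embedding E
    (one row per object) by comparing Euclidean distances; returns 'b' or 'c'.
    The learned crowd kernel is optimized so its induced distances reproduce
    such answers collected from human raters.
    """
    d_ab = np.linalg.norm(E[a] - E[b])
    d_ac = np.linalg.norm(E[a] - E[c])
    return 'b' if d_ab < d_ac else 'c'

E = np.array([[0.0, 0.0], [1.0, 0.0], [5.0, 5.0]])
print(triplet_answer(E, 0, 1, 2))  # object 0 is closer to 1 than to 2, so: b
```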
Learning Pairwise Dissimilarity Profiles for Appearance Recognition in Visual Surveillance
"... Abstract. Training discriminative classifiers for a large number of classes is a challenging problem due to increased ambiguities between classes. In order to better handle the ambiguities and to improve the scalability of classifiers to larger number of categories, we learn pairwise dissimilarity p ..."
Abstract

Cited by 28 (1 self)
Abstract. Training discriminative classifiers for a large number of classes is a challenging problem due to increased ambiguities between classes. In order to better handle the ambiguities and to improve the scalability of classifiers to a larger number of categories, we learn pairwise dissimilarity profiles (functions of spatial location) between categories and adapt them into nearest neighbor classification. We introduce a dissimilarity distance measure and linearly or nonlinearly combine it with direct distances. We illustrate and demonstrate the approach mainly in the context of appearance-based person recognition.
Generalized non-metric multidimensional scaling
 In Proceedings of the Twelfth International Conference on Artificial Intelligence and Statistics
, 2007
"... We consider the nonmetric multidimensional scaling problem: given a set of dissimilarities ∆, find an embedding whose interpoint Euclidean distances have the same ordering as ∆. In this paper, we look at a generalization of this problem in which only a set of order relations of the form dij < d ..."
Abstract

Cited by 26 (9 self)
We consider the non-metric multidimensional scaling problem: given a set of dissimilarities ∆, find an embedding whose interpoint Euclidean distances have the same ordering as ∆. In this paper, we look at a generalization of this problem in which only a set of order relations of the form d_ij < d_kl are provided. Unlike the original problem, these order relations can be contradictory and need not be specified for all pairs of dissimilarities. We argue that this setting is more natural in some experimental settings and propose an algorithm based on convex optimization techniques to optimally solve this problem. We apply this algorithm to human subject data from a psychophysics experiment concerning how reflectance properties are perceived. We also look at the standard NMDS problem, where a dissimilarity matrix ∆ is provided as input, and show that we can always find an order-respecting embedding of ∆.
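Order relations of the form d_ij < d_kl are typically handled through a convex hinge surrogate. A minimal sketch, assuming NumPy, of such a penalty on a candidate embedding; this is an illustration of the constraint structure, not the paper's exact semidefinite program over the Gram matrix:

```python
import numpy as np

def ordinal_hinge_loss(X, constraints, margin=1.0):
    """Hinge penalty for ordinal constraints on an embedding X (one row per
    object). Each tuple (i, j, k, l) encodes d_ij < d_kl and contributes
    max(0, d_ij^2 - d_kl^2 + margin), so violated or barely satisfied
    constraints are penalized while well-separated ones cost nothing.
    """
    loss = 0.0
    for i, j, k, l in constraints:
        d_ij2 = float(np.sum((X[i] - X[j]) ** 2))
        d_kl2 = float(np.sum((X[k] - X[l]) ** 2))
        loss += max(0.0, d_ij2 - d_kl2 + margin)
    return loss

X = np.array([[0.0], [1.0], [4.0]])
# d(0,1)^2 = 1 is far below d(0,2)^2 = 16, so the constraint costs nothing.
print(ordinal_hinge_loss(X, [(0, 1, 0, 2)]))  # 0.0
```

Because the constraints may be contradictory, minimizing the total penalty rather than enforcing every relation exactly is what makes the problem well-posed.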
Heterogeneous embedding for subjective artist similarity
 In Tenth International Symposium for Music Information Retrieval (ISMIR 2009)
, 2009
"... We describe an artist recommendation system which integrates several heterogeneous data sources to form a holistic similarity space. Using social, semantic, and acoustic features, we learn a lowdimensional feature transformation which is optimized to reproduce humanderived measurements of subjecti ..."
Abstract

Cited by 23 (2 self)
We describe an artist recommendation system which integrates several heterogeneous data sources to form a holistic similarity space. Using social, semantic, and acoustic features, we learn a low-dimensional feature transformation which is optimized to reproduce human-derived measurements of subjective similarity between artists. By producing low-dimensional representations of artists, our system is suitable for visualization and recommendation tasks.
Learning multiview neighborhood preserving projections
 In Proc. of the International Conference on Machine Learning (ICML)
, 2011
"... We address the problem of metric learning for multiview data, namely the construction of embedding projections from data in different representations into a shared feature space, such that the Euclidean distance in this space provides a meaningful withinview as well as betweenview similarity. Our ..."
Abstract

Cited by 22 (4 self)
We address the problem of metric learning for multi-view data, namely the construction of embedding projections from data in different representations into a shared feature space, such that the Euclidean distance in this space provides a meaningful within-view as well as between-view similarity. Our motivation stems from cross-media retrieval tasks, where the availability of a joint Euclidean distance function is a prerequisite to allow fast, in particular hashing-based, nearest neighbor queries. We formulate an objective function that expresses the intuitive concept that matching samples are mapped closely together in the output space, whereas non-matching samples are pushed apart, no matter in which view they are available. The resulting optimization problem is not convex, but it can be decomposed explicitly into a convex and a concave part, thereby allowing efficient optimization using the convex-concave procedure. Experiments on an image retrieval task show that nearest-neighbor based cross-view retrieval is indeed possible, and the proposed technique improves the retrieval accuracy over baseline techniques.
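The convex-concave procedure itself is easy to illustrate on a toy problem. Below, a hedged one-dimensional sketch (not the paper's objective): f(x) = x⁴ - x² written as the difference of convex functions u(x) = x⁴ and v(x) = x²; each step linearizes the concave part -v at the current point and solves the resulting convex subproblem in closed form:

```python
def cccp_minimize(x0, steps=50):
    """Convex-concave procedure on f(x) = x^4 - x^2 = u(x) - v(x) with
    u(x) = x^4 and v(x) = x^2 both convex. Each iteration replaces -v by its
    tangent at x_t and minimizes the convex surrogate x^4 - v'(x_t) * x,
    whose stationarity condition 4 x^3 = v'(x_t) has a closed-form root.
    Start from x0 > 0 so the cube root stays real.
    """
    x = x0
    for _ in range(steps):
        g = 2.0 * x                    # gradient of v(x) = x^2 at x_t
        x = (g / 4.0) ** (1.0 / 3.0)   # argmin of x^4 - g*x
    return x

# The iterates converge to 1/sqrt(2), the true minimizer of x^4 - x^2.
print(round(cccp_minimize(1.0), 4))  # 0.7071
```

As in the paper's setting, each surrogate upper-bounds the objective, so the iteration decreases f monotonically to a stationary point.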
Creating a Cluster Hierarchy under Constraints of a Partially Known Hierarchy
"... Although clustering under constraints is a current research topic, a hierarchical setting, in which a hierarchy of clusters is the goal, is usually not considered. This paper tries to fill this gap by analyzing a scenario, where constraints are derived from a hierarchy that is partially known in adv ..."
Abstract

Cited by 21 (7 self)
Although clustering under constraints is a current research topic, a hierarchical setting, in which a hierarchy of clusters is the goal, is usually not considered. This paper tries to fill this gap by analyzing a scenario where constraints are derived from a hierarchy that is partially known in advance. This scenario arises, e.g., when structuring a collection of documents according to a user-specific hierarchy. Major issues of current approaches to constraint-based clustering are discussed, especially with respect to the hierarchical setting. We introduce the concept of hierarchical constraints and continue by presenting and evaluating two approaches that use them. The approaches cover the two major fields of constraint-based clustering, i.e., instance-based and metric-based constraint integration. Our objects of interest are text documents, so the presented algorithms are fitted to work with them where necessary. Besides showing the properties and ideas of the algorithms in general, we evaluate the case of constraints that are unevenly scattered over the instance space, which is very common for real-world problems but not satisfyingly covered in other work so far.