The support vector decomposition machine
In Proceedings of the International Conference on Machine Learning (ICML), 2006
Cited by 12 (6 self)
In machine learning problems with tens of thousands of features and only dozens or hundreds of independent training examples, dimensionality reduction is essential for good learning performance. In previous work, many researchers have treated the learning problem in two separate phases: first use an algorithm such as singular value decomposition to reduce the dimensionality of the data set, and then use a classification algorithm such as naïve Bayes or support vector machines to learn a classifier. We demonstrate that it is possible to combine the two goals of dimensionality reduction and classification into a single learning objective, and present a novel and efficient algorithm which optimizes this objective directly. We present experimental results in fMRI analysis which show that we can achieve better learning performance and lower-dimensional representations than two-phase approaches can.
Information Retrieval Perspective to Nonlinear Dimensionality Reduction for Data Visualization
Cited by 12 (3 self)
Nonlinear dimensionality reduction methods are often used to visualize high-dimensional data, although the existing methods have been designed for other related tasks such as manifold learning. It has been difficult to assess the quality of visualizations since the task has not been well-defined. We give a rigorous definition for a specific visualization task, resulting in quantifiable goodness measures and new visualization methods. The task is information retrieval given the visualization: to find similar data based on the similarities shown on the display. The fundamental tradeoff between precision and recall of information retrieval can then be quantified in visualizations as well. The user needs to give the relative cost of missing similar points vs. retrieving dissimilar points, after which the total cost can be measured. We then introduce a new method NeRV (neighbor retrieval visualizer) which produces an optimal visualization by minimizing the cost. We further derive a variant for supervised visualization; class information is taken rigorously into account when computing the similarity relationships. We show empirically that the unsupervised version outperforms existing unsupervised dimensionality reduction methods in the visualization task, and the supervised version outperforms existing supervised methods.
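The retrieval view of visualization described in this abstract can be illustrated with a small sketch. It is not the NeRV method itself: it assumes hard k-nearest-neighbor sets, and the function names (`knn`, `retrieval_cost`) and the `miss_weight` parameter are hypothetical names chosen for illustration. It measures, for one query point, the precision and recall of neighbors read off the display against neighbors in the original space, then combines misses and false positives with a user-chosen relative cost.

```python
import math

def knn(points, i, k):
    """Indices of the k nearest neighbors of point i (Euclidean distance)."""
    others = [j for j in range(len(points)) if j != i]
    others.sort(key=lambda j: math.dist(points[i], points[j]))
    return set(others[:k])

def retrieval_cost(high, low, i, k, r, miss_weight):
    """Precision, recall and weighted cost for query point i.

    high: original coordinates; low: display coordinates.
    miss_weight in [0, 1] is the assumed relative cost of missing a
    similar point; 1 - miss_weight is the cost of a false positive.
    """
    relevant = knn(high, i, k)    # truly similar points
    retrieved = knn(low, i, r)    # points that look similar on the display
    hits = len(relevant & retrieved)
    misses = len(relevant) - hits
    false_positives = len(retrieved) - hits
    precision = hits / len(retrieved)
    recall = hits / len(relevant)
    cost = miss_weight * misses + (1 - miss_weight) * false_positives
    return precision, recall, cost

# Toy data: four 3-D points and a 1-D "visualization" of them.
high = [(0, 0, 0), (1, 0, 0), (0, 2, 0), (5, 5, 5)]
low = [(0.0,), (1.0,), (4.0,), (1.5,)]
print(retrieval_cost(high, low, i=0, k=2, r=2, miss_weight=0.7))
```

In the toy data, the display places the far-away fourth point next to the query, so one true neighbor is missed and one false neighbor is retrieved.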
An efficient algorithm for local distance metric learning
In Proceedings of AAAI, 2006
Cited by 12 (6 self)
Learning application-specific distance metrics from labeled data is critical for both statistical classification and information retrieval. Most of the earlier work in this area has focused on finding metrics that simultaneously optimize compactness and separability in a global sense. Specifically, such distance metrics attempt to keep all of the data points in each class close together while ensuring that data points from different classes are separated. However, particularly when classes exhibit multimodal data distributions, these goals conflict and thus cannot be simultaneously satisfied. This paper proposes a Local Distance Metric (LDM) that aims to optimize local compactness and local separability. We present an efficient algorithm that employs eigenvector analysis and bound optimization to learn the LDM from training data in a probabilistic framework. We demonstrate that LDM achieves significant improvements in both classification and retrieval accuracy compared to global distance learning and kernel-based KNN.
Descriptor Learning for Efficient Retrieval
Cited by 12 (0 self)
Many visual search and matching systems represent images using sparse sets of “visual words”: descriptors that have been quantized by assignment to the best-matching symbol in a discrete vocabulary. Errors in this quantization procedure propagate throughout the rest of the system, either harming performance or requiring correction using additional storage or processing. This paper aims to reduce these quantization errors at source, by learning a projection from descriptor space to a new Euclidean space in which standard clustering techniques are more likely to assign matching descriptors to the same cluster, and non-matching descriptors to different clusters. To achieve this, we learn a non-linear transformation model by minimizing a novel margin-based cost function, which aims to separate matching descriptors from two classes of non-matching descriptors. Training data is generated automatically by leveraging geometric consistency. Scalable, stochastic gradient methods are used for the optimization. For the case of particular object retrieval, we demonstrate impressive gains in performance on a ground truth dataset: our learnt 32-D descriptor without spatial re-ranking outperforms a baseline method using 128-D SIFT descriptors with spatial re-ranking.
Similarity Scores based on Background Samples
Cited by 12 (3 self)
Evaluating the similarity of images and their descriptors by employing discriminative learners has proven itself to be an effective face recognition paradigm. In this paper we show how “background samples”, that is, examples which do not belong to any of the classes being learned, may provide a significant performance boost to such face recognition systems. In particular, we make the following contributions. First, we define and evaluate the “Two-Shot Similarity” (TSS) score as an extension to the recently proposed “One-Shot Similarity” (OSS) measure. Both these measures utilize background samples to facilitate better recognition rates. Second, we examine the ranking of images most similar to a query image and employ these as a descriptor for that image. Finally, we provide results underscoring the importance of proper face alignment in automatic face recognition systems. These contributions in concert allow us to obtain a success rate of 86.83% on the Labeled Faces in the Wild (LFW) benchmark, outperforming current state-of-the-art results.
Large Scale Online Learning of Image Similarity through Ranking
Cited by 12 (2 self)
Learning a measure of similarity between pairs of objects is an important generic problem in machine learning. It is particularly useful in large scale applications like searching for an image that is similar to a given image or finding videos that are relevant to a given video. In these tasks, users look for objects that are not only visually similar but also semantically related to a given object. Unfortunately, the approaches that exist today for learning such semantic similarity do not scale to large datasets. This is both because typically their CPU and storage requirements grow quadratically with the sample size, and because many methods impose complex positivity constraints on the space of learned similarity functions. The current paper presents OASIS, an Online Algorithm for Scalable Image Similarity learning that learns a bilinear similarity measure over sparse representations. OASIS is an online dual approach using the passive-aggressive family of learning algorithms with a large margin criterion and an efficient hinge loss cost. Our experiments show that OASIS is both fast and accurate at a wide range of scales: for a dataset with thousands of images, it achieves better results than existing state-of-the-art methods, while being an order of
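As a rough illustration of the ingredients this abstract names (a bilinear similarity, a large-margin hinge loss, and a passive-aggressive update), the sketch below performs one online step on a (query, similar, dissimilar) triplet. The step-size rule is the standard passive-aggressive one and is an assumption here, since the abstract does not spell it out; the names `bilinear`, `pa_step` and the aggressiveness parameter `C` are hypothetical.

```python
def bilinear(W, a, b):
    """Bilinear similarity S_W(a, b) = a^T W b."""
    return sum(a[i] * W[i][j] * b[j]
               for i in range(len(a)) for j in range(len(b)))

def pa_step(W, query, pos, neg, C=0.1):
    """One passive-aggressive update on a triplet.

    Hinge loss: max(0, 1 - S(query, pos) + S(query, neg)).
    Returns the (possibly unchanged) matrix and the loss before the update.
    """
    loss = max(0.0, 1.0 - bilinear(W, query, pos) + bilinear(W, query, neg))
    if loss == 0.0:
        return W, loss  # passive: the margin is already satisfied
    diff = [p - n for p, n in zip(pos, neg)]
    # Gradient of S(query, pos) - S(query, neg) with respect to W.
    V = [[q * d for d in diff] for q in query]
    norm_sq = sum(v * v for row in V for v in row)
    if norm_sq == 0.0:
        return W, loss
    tau = min(C, loss / norm_sq)  # aggressive step, capped by C (assumed rule)
    W_new = [[w + tau * v for w, v in zip(w_row, v_row)]
             for w_row, v_row in zip(W, V)]
    return W_new, loss

# Toy triplet: start from the identity matrix, which ranks neg above pos.
W = [[1.0, 0.0], [0.0, 1.0]]
query, pos, neg = [1.0, 0.0], [0.0, 1.0], [1.0, 0.0]
W, loss_before = pa_step(W, query, pos, neg, C=10.0)
print(loss_before, bilinear(W, query, pos), bilinear(W, query, neg))
```

Each step touches only the triplet at hand, and the sketch places no positivity or symmetry constraint on `W`.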
The one-shot similarity kernel
In International Conference on Computer Vision (ICCV), 2009
Cited by 11 (5 self)
The One-Shot similarity measure has recently been introduced in the context of face recognition where it was used to produce state-of-the-art results. Given two vectors, their One-Shot similarity score reflects the likelihood of each vector belonging in the same class as the other vector and not in a class defined by a fixed set of “negative” examples. The potential of this approach has thus far been largely unexplored. In this paper we analyze the One-Shot score and show that: (1) when using a version of LDA as the underlying classifier, this score is a Conditionally Positive Definite kernel and may be used within kernel methods (e.g., SVM), (2) it can be efficiently computed, and (3) it is effective as an underlying mechanism for image representation. We further demonstrate the effectiveness of the One-Shot similarity score in a number of applications including multiclass identification and descriptor generation.
Adaptive relevance matrices in learning vector quantization
2009
Cited by 11 (9 self)
We propose a new matrix learning scheme to extend relevance learning vector quantization (RLVQ), an efficient prototype-based classification algorithm, towards a general adaptive metric. By introducing a full matrix of relevance factors in the distance measure, correlations between different features and their importance for the classification scheme can be taken into account, and automated, general metric adaptation takes place during training. In comparison to the weighted Euclidean metric used in RLVQ and its variations, a full matrix is better able to represent the internal structure of the data. Large margin generalization bounds can be transferred to this case, leading to bounds which are independent of the input dimensionality. This also holds for local metrics attached to each prototype, which correspond to piecewise quadratic decision boundaries. The algorithm is tested in comparison to alternative LVQ schemes using an artificial data set, a benchmark multi-class problem from the UCI repository, and a problem from bioinformatics, the recognition of splice sites for C. elegans.
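The difference between a weighted Euclidean metric and a full relevance matrix can be seen in a small sketch. It assumes the relevance matrix is parameterized as Λ = ΩᵀΩ, so that the distance is never negative; this parameterization and the names `matrix_distance` and `nearest_prototype` are assumptions made for illustration and are not taken from the abstract. Training of Ω and the prototypes is not shown.

```python
def matrix_distance(omega, x, w):
    """d(x, w) = (x - w)^T (Omega^T Omega) (x - w) = ||Omega (x - w)||^2."""
    diff = [a - b for a, b in zip(x, w)]
    projected = [sum(o * d for o, d in zip(row, diff)) for row in omega]
    return sum(p * p for p in projected)

def nearest_prototype(omega, x, prototypes):
    """Label of the closest prototype; prototypes is a list of (vector, label)."""
    return min(prototypes, key=lambda pw: matrix_distance(omega, x, pw[0]))[1]

# A diagonal Omega only re-weights features (here it ignores the second one).
omega_diag = [[1.0, 0.0], [0.0, 0.0]]
# A full Omega can also use correlations: this one measures x1 - x2.
omega_full = [[1.0, -1.0], [0.0, 0.0]]

prototypes = [([0.0, 0.0], "A"), ([2.0, 0.0], "B")]
x = [3.0, 3.0]
print(nearest_prototype(omega_diag, x, prototypes))
print(nearest_prototype(omega_full, x, prototypes))
```

With the diagonal matrix the sample is assigned by its first feature alone; with the full matrix the assignment depends on the difference between the two features, which a per-feature weighting cannot express.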
Fast Nearest Neighbor Retrieval for Bregman Divergences
Cited by 10 (1 self)
We present a data structure enabling efficient nearest neighbor (NN) retrieval for Bregman divergences. The family of Bregman divergences includes many popular dissimilarity measures including KL-divergence (relative entropy), Mahalanobis distance, and Itakura-Saito divergence. These divergences present a challenge for efficient NN retrieval because they are not, in general, metrics, for which most NN data structures are designed. The data structure introduced in this work shares the same basic structure as the popular metric ball tree, but employs convexity properties of Bregman divergences in place of the triangle inequality. Experiments demonstrate speedups over brute-force search of up to several orders of magnitude.
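For readers unfamiliar with the family, a Bregman divergence is defined from a strictly convex function f as D_f(x, y) = f(x) - f(y) - ⟨∇f(y), x - y⟩. The sketch below evaluates this definition for three generating functions; it covers only the divergences themselves, not the tree data structure of the paper, and all function names are hypothetical.

```python
import math

def bregman(f, grad_f, x, y):
    """D_f(x, y) = f(x) - f(y) - <grad f(y), x - y>."""
    inner = sum(g * (xi - yi) for g, xi, yi in zip(grad_f(y), x, y))
    return f(x) - f(y) - inner

# f(x) = ||x||^2 generates the squared Euclidean distance.
def sq_norm(x):
    return sum(v * v for v in x)

def sq_norm_grad(x):
    return [2 * v for v in x]

# f(x) = sum x_i log x_i generates the KL-divergence on probability vectors.
def neg_entropy(x):
    return sum(v * math.log(v) for v in x)

def neg_entropy_grad(x):
    return [math.log(v) + 1 for v in x]

# f(x) = -sum log x_i generates the Itakura-Saito divergence.
def burg(x):
    return -sum(math.log(v) for v in x)

def burg_grad(x):
    return [-1 / v for v in x]

p, q = [0.5, 0.5], [0.9, 0.1]
print(bregman(sq_norm, sq_norm_grad, [1.0, 2.0], [4.0, 6.0]))
print(bregman(neg_entropy, neg_entropy_grad, p, q))
print(bregman(neg_entropy, neg_entropy_grad, q, p))
print(bregman(burg, burg_grad, p, q))
```

The two KL values differ, which shows the asymmetry that rules out data structures relying on metric properties.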
Generalized Non-metric Multidimensional Scaling
Cited by 9 (6 self)
We consider the non-metric multidimensional scaling problem: given a set of dissimilarities ∆, find an embedding whose inter-point Euclidean distances have the same ordering as ∆. In this paper, we look at a generalization of this problem in which only a set of order relations of the form d_ij < d_kl are provided. Unlike the original problem, these order relations can be contradictory and need not be specified for all pairs of dissimilarities. We argue that this setting is more natural in some experimental settings and propose an algorithm based on convex optimization techniques to solve this problem. We apply this algorithm to human subject data from a psychophysics experiment concerning how reflectance properties are perceived. We also look at the standard NMDS problem, where a dissimilarity matrix ∆ is provided as input, and show that we can always find an order-respecting embedding of ∆.
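The kind of input this generalized setting works with can be made concrete with a short sketch. It does not implement the paper's convex optimization; it only checks how many requested order relations a given embedding violates. The encoding of a relation as a pair of index pairs and the name `count_violations` are assumptions made for illustration.

```python
import math

def count_violations(points, relations):
    """Count order relations that the embedding fails to satisfy.

    relations: list of ((i, j), (k, l)), each requesting
    dist(i, j) < dist(k, l) between embedded points.
    """
    def dist(a, b):
        return math.dist(points[a], points[b])
    return sum(1 for (i, j), (k, l) in relations
               if not dist(i, j) < dist(k, l))

# Three points on a line: d_01 = 1, d_12 = 2, d_02 = 3.
points = [(0.0, 0.0), (1.0, 0.0), (3.0, 0.0)]
# The three relations form a cycle, so no embedding can satisfy all of them.
relations = [((0, 1), (1, 2)), ((1, 2), (0, 2)), ((0, 2), (0, 1))]
print(count_violations(points, relations))
```

Because relations of this kind may be contradictory or incomplete, counting (or otherwise penalizing) violations is a more natural target than demanding that every relation hold.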

