Results 1–10 of 39
Semi-Supervised Learning Literature Survey
, 2006
Abstract

Cited by 447 (8 self)
We review the literature on semi-supervised learning, which is an area in machine learning and, more generally, artificial intelligence. There has been a whole
spectrum of interesting ideas on how to learn from both labeled and unlabeled data, i.e. semi-supervised learning. This document is a chapter excerpt from the author’s
doctoral thesis (Zhu, 2005). However, the author plans to update the online version frequently to incorporate the latest developments in the field. Please obtain the latest
version at http://www.cs.wisc.edu/~jerryzhu/pub/ssl_survey.pdf
Unsupervised Learning of Image Manifolds by Semidefinite Programming
, 2004
Abstract

Cited by 162 (9 self)
Can we detect low dimensional structure in high dimensional data sets of images and video? The problem of dimensionality reduction arises often in computer vision and pattern recognition. In this paper, we propose a new solution to this problem based on semidefinite programming. Our algorithm can be used to analyze high dimensional data that lies on or near a low dimensional manifold. It overcomes certain limitations of previous work in manifold learning, such as Isomap and locally linear embedding. We illustrate the algorithm on easily visualized examples of curves and surfaces, as well as on actual images of faces, handwritten digits, and solid objects.
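The unfolding intuition behind this algorithm can be sketched in a few lines of code: points on a curved one-dimensional manifold (here a circular arc, chosen purely for illustration) are mapped to arc-length coordinates, which approximately preserves nearest-neighbor distances while increasing variance. This is a toy sketch of the idea only, not the paper's semidefinite program:

```python
import math

# Toy illustration of "unfolding": points on a half-circle arc in 2-D are
# mapped to their arc-length coordinates in 1-D. Neighbor distances are
# (approximately) preserved, while the spread of the embedding grows.
n, radius = 20, 1.0
thetas = [math.pi * i / (n - 1) for i in range(n)]
arc = [(radius * math.cos(t), radius * math.sin(t)) for t in thetas]
unfolded = [radius * t for t in thetas]          # arc-length coordinate

def dist2d(p, q):
    return math.hypot(p[0] - q[0], p[1] - q[1])

def variance(xs):
    m = sum(xs) / len(xs)
    return sum((x - m) ** 2 for x in xs) / len(xs)

# neighbor distances before (chords in 2-D) and after (gaps in 1-D)
before = [dist2d(arc[i], arc[i + 1]) for i in range(n - 1)]
after = [abs(unfolded[i + 1] - unfolded[i]) for i in range(n - 1)]

# local distances barely change ...
max_gap = max(abs(b - a) for b, a in zip(before, after))
# ... but the 1-D variance exceeds that of either original coordinate:
# the curve has been "unfolded" into a high-variance flat embedding
var_1d = variance(unfolded)
var_x = variance([p[0] for p in arc])
```

Maximum variance unfolding turns exactly this trade-off (maximize variance subject to local distance constraints) into a semidefinite program.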
Harmonic mixtures: combining mixture models and graph-based methods for inductive and scalable semi-supervised learning
 In Proc. Int. Conf. Machine Learning
, 2005
Abstract

Cited by 36 (2 self)
Graph-based methods for semi-supervised learning have recently been shown to be promising for combining labeled and unlabeled data in classification problems. However, inference for graph-based methods often does not scale well to very large data sets, since it requires inversion of a large matrix or solution of a large linear program. Moreover, such approaches are inherently transductive, giving predictions for only those points in the unlabeled set, and not for an arbitrary test point. In this paper a new approach is presented that preserves the strengths of graph-based semi-supervised learning while overcoming the limitations of scalability and non-inductive inference, through a combination of generative mixture models and discriminative regularization using the graph Laplacian. Experimental results show that this approach preserves the accuracy of purely graph-based transductive methods when the data has “manifold structure,” and at the same time achieves inductive learning with significantly reduced computational cost.
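The harmonic-function computation at the core of the graph-based methods discussed above can be sketched on a toy graph (a minimal illustration, not the authors' harmonic-mixture algorithm): labeled values are clamped, and each unlabeled node is repeatedly replaced by the mean of its neighbors until convergence:

```python
# Harmonic solution on a 5-node path graph with two labeled endpoints.
# On a path this converges to linear interpolation between the labels.
edges = [(0, 1), (1, 2), (2, 3), (3, 4)]
labels = {0: 0.0, 4: 1.0}            # clamped labeled nodes

neighbors = {i: [] for i in range(5)}
for a, b in edges:
    neighbors[a].append(b)
    neighbors[b].append(a)

f = [labels.get(i, 0.5) for i in range(5)]   # unlabeled start at 0.5
for _ in range(1000):                        # fixed-point iteration
    for i in range(5):
        if i not in labels:
            f[i] = sum(f[j] for j in neighbors[i]) / len(neighbors[i])
# f is now (approximately) the harmonic function [0.0, 0.25, 0.5, 0.75, 1.0]
```

This is the transductive computation whose matrix form requires the large inversion the abstract refers to; the paper's contribution is making it inductive and scalable via mixture models.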
An introduction to nonlinear dimensionality reduction by maximum variance unfolding
 Proceedings of the 21st National Conference on Artificial Intelligence
, 2006
Graph Laplacian Regularization for Large-Scale ...
, 2006
Abstract

Cited by 25 (2 self)
In many areas of science and engineering, the problem arises of how to discover low dimensional representations of high dimensional data. Recently, a number of researchers have converged on common solutions to this problem using methods from convex optimization. In particular, many results have been obtained by constructing semidefinite programs (SDPs) with low rank solutions. While the rank of matrix variables in SDPs cannot be directly constrained, it has been observed that low rank solutions emerge naturally by computing high variance or maximal trace solutions that respect local distance constraints. In this paper, we show how to solve very large problems of this type by a matrix factorization that leads to much smaller SDPs than those previously studied. The matrix factorization is derived by expanding the solution of the original problem in terms of the bottom eigenvectors of a graph Laplacian. The smaller SDPs obtained from this matrix factorization yield very good approximations to solutions of the original problem. Moreover, these approximations can be further refined by conjugate gradient descent. We illustrate the approach on localization in large scale sensor networks, where optimizations involving tens of thousands of nodes can be solved in just a few minutes.
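The bottom eigenvectors of a graph Laplacian that drive this factorization are easy to exhibit on a path graph, where they have a known closed form (the smooth DCT-II cosines). A small self-contained check, independent of the paper's solver:

```python
import math

# For the n-node path graph, the unnormalized Laplacian L = D - A has
# eigenpairs  v_k(j) = cos(pi*k*(j+0.5)/n),  lambda_k = 2 - 2*cos(pi*k/n).
# The bottom (small-lambda) eigenvectors are the smoothest ones; these are
# the basis the abstract expands solutions in. We verify L v = lambda v.
n = 8
L = [[0.0] * n for _ in range(n)]
for i in range(n - 1):
    L[i][i] += 1.0
    L[i + 1][i + 1] += 1.0
    L[i][i + 1] -= 1.0
    L[i + 1][i] -= 1.0

def eigvec(k):
    return [math.cos(math.pi * k * (j + 0.5) / n) for j in range(n)]

def matvec(M, v):
    return [sum(M[i][j] * v[j] for j in range(len(v))) for i in range(len(v))]

residual = 0.0
for k in range(3):                       # the three bottom eigenpairs
    v = eigvec(k)
    lam = 2.0 - 2.0 * math.cos(math.pi * k / n)
    Lv = matvec(L, v)
    residual = max(residual, max(abs(Lv[j] - lam * v[j]) for j in range(n)))
# residual is at machine precision: the smooth cosines are exact eigenvectors
```

Restricting the SDP variable to the span of such smooth eigenvectors is what shrinks the matrix variable from thousands of rows to a handful.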
Information Retrieval Perspective to Nonlinear Dimensionality Reduction for Data Visualization
Abstract

Cited by 24 (4 self)
Nonlinear dimensionality reduction methods are often used to visualize high-dimensional data, although the existing methods have been designed for other related tasks such as manifold learning. It has been difficult to assess the quality of visualizations since the task has not been well defined. We give a rigorous definition for a specific visualization task, resulting in quantifiable goodness measures and new visualization methods. The task is information retrieval given the visualization: to find similar data based on the similarities shown on the display. The fundamental tradeoff between precision and recall of information retrieval can then be quantified in visualizations as well. The user needs to give the relative cost of missing similar points vs. retrieving dissimilar points, after which the total cost can be measured. We then introduce a new method, NeRV (neighbor retrieval visualizer), which produces an optimal visualization by minimizing the cost. We further derive a variant for supervised visualization; class information is taken rigorously into account when computing the similarity relationships. We show empirically that the unsupervised version outperforms existing unsupervised dimensionality reduction methods in the visualization task, and the supervised version outperforms existing supervised methods.
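The retrieval view of visualization described above can be sketched directly (a toy example with invented 3-D data and a hand-picked 2-D display, not the NeRV optimizer): the k nearest neighbors in the original space are the "relevant" items, the k nearest neighbors on the display are the "retrieved" items, and precision/recall compare the two sets:

```python
def knn(points, i, k):
    """Indices of the k nearest neighbors of point i (excluding i)."""
    d = sorted((sum((a - b) ** 2 for a, b in zip(points[i], points[j])), j)
               for j in range(len(points)) if j != i)
    return {j for _, j in d[:k]}

# two well-separated clusters; the 2-D display drops the (constant) z-axis
original = [(0, 0, 0), (1, 0, 0), (2, 0, 0), (0, 5, 0), (1, 5, 0)]
display = [(0, 0), (1, 0), (2, 0), (0, 5), (1, 5)]

k = 2
prec = rec = 0.0
for i in range(len(original)):
    relevant = knn(original, i, k)    # true high-dimensional neighbors
    retrieved = knn(display, i, k)    # neighbors shown on the display
    prec += len(relevant & retrieved) / len(retrieved)
    rec += len(relevant & retrieved) / len(relevant)
prec /= len(original)
rec /= len(original)
# this display preserves every neighborhood, so precision = recall = 1.0
```

A display that collapsed the two clusters would keep recall high but destroy precision; NeRV optimizes a user-weighted combination of the two costs.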
A duality view of spectral methods for dimensionality reduction
 In ICML ’06: Proceedings of the 23rd international conference on Machine learning
, 2006
Abstract

Cited by 12 (0 self)
We present a unified duality view of several recently emerged spectral methods for nonlinear dimensionality reduction, including Isomap, locally linear embedding, Laplacian eigenmaps, and maximum variance unfolding. We discuss the duality theory for the maximum variance unfolding problem, and show that other methods are directly related to either its primal formulation or its dual formulation, or can be interpreted from the optimality conditions. This duality framework reveals close connections between these seemingly quite different algorithms. In particular, it resolves the puzzle of why some of these methods use the top eigenvectors of a dense matrix while others use the bottom eigenvectors of a sparse matrix: the two eigenspaces are exactly aligned at primal-dual optimality.
Fast low-rank semidefinite programming for embedding and clustering
 in Eleventh International Conference on Artificial Intelligence and Statistics, AISTATS 2007
, 2007
Abstract

Cited by 10 (2 self)
Many non-convex problems in machine learning, such as embedding and clustering, have been solved using convex semidefinite relaxations. These semidefinite programs (SDPs) are expensive to solve and are hence limited to very small data sets. In this paper we show how we can improve the quality and speed of solving a number of these problems by casting them as low-rank SDPs and then directly solving them using a non-convex optimization algorithm. In particular, we show that problems such as k-means clustering and maximum variance unfolding (MVU) may be expressed exactly as low-rank SDPs and solved using our approach. We demonstrate that on the above problems our approach is significantly faster, far more scalable, and often produces better results compared to traditional SDP relaxation techniques.
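The low-rank device can be sketched on a tiny instance (a hedged illustration, not the authors' solver): rather than optimizing over a PSD matrix X, factor X = YYᵀ with Y low rank and run plain gradient descent on Y. The target matrix and step size below are invented for illustration:

```python
# Rank-1 factorization of a small PSD matrix by non-convex gradient
# descent: minimize ||A - y y^T||_F^2 over the rank-1 factor y.
A = [[4.0, 2.0],
     [2.0, 1.0]]                 # rank-1 PSD target: (2,1)(2,1)^T
y = [1.0, 1.0]                   # initial rank-1 factor

lr = 0.005
for _ in range(5000):
    # residual E = y y^T - A; gradient of ||E||_F^2 wrt y is 4 E y
    E = [[y[i] * y[j] - A[i][j] for j in range(2)] for i in range(2)]
    g = [4.0 * sum(E[i][j] * y[j] for j in range(2)) for i in range(2)]
    y = [y[i] - lr * g[i] for i in range(2)]

# y converges to +/- (2, 1), so y y^T matches A to high accuracy
err = max(abs(y[i] * y[j] - A[i][j]) for i in range(2) for j in range(2))
```

The same trick scales: an n×n matrix variable becomes an n×r factor with r ≪ n, which is why the low-rank formulation is so much faster than a generic SDP solver.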
Model-based scene analysis
 Computational Auditory Scene Analysis: Principles, Algorithms, and Applications, chapter 4
, 2006
Abstract

Cited by 6 (2 self)
When multiple sound sources are mixed together into a single channel (or a small number of channels) it is in general impossible to recover the exact waveforms that were mixed – indeed, without some kind of constraints on the form of the component signals it is impossible to separate them at all. These constraints could take several ...
Map Building without Localization by Dimensionality Reduction Techniques
Abstract

Cited by 6 (0 self)
This paper proposes a new map building framework for mobile robots named Localization-Free Mapping by Dimensionality Reduction (LFMDR). In this framework, robot map building is interpreted as the problem of reconstructing the 2D coordinates of objects so that they maximally preserve the local proximity of the objects in the space of the robot’s observation history. Not only traditional linear PCA but also recent manifold learning techniques can be used for solving this problem. In contrast to the SLAM framework, the LFMDR framework requires neither localization procedures nor explicit measurement and motion models. In the latter part of this paper, we demonstrate “visibility-only” and “bearing-only” localization-free mappings, derived by applying the LFMDR framework to visibility and bearing measurements respectively.
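The "visibility-only" idea can be caricatured in a few lines (a toy sketch with invented object and pose positions, not the paper's experiments): each object is described by the history of robot poses from which it was visible, and the leading principal direction of those descriptors recovers the objects' coarse layout along a corridor:

```python
import math

# Five objects along a corridor; the robot visits five poses and records,
# per object, a binary visibility feature per pose. PCA on these
# descriptors (leading eigenvector of the Gram matrix, via power
# iteration) places the objects in a 1-D map without any localization.
n = 5
X = [[1.0 if abs(i - p) <= 1 else 0.0 for p in range(n)] for i in range(n)]

# center each visibility feature (column), as PCA requires
for p in range(n):
    mean = sum(X[i][p] for i in range(n)) / n
    for i in range(n):
        X[i][p] -= mean

# object-by-object Gram matrix and its leading eigenvector
G = [[sum(X[i][k] * X[j][k] for k in range(n)) for j in range(n)]
     for i in range(n)]
v = [float(i + 1) for i in range(n)]          # generic start vector
for _ in range(200):
    w = [sum(G[i][j] * v[j] for j in range(n)) for i in range(n)]
    s = math.sqrt(sum(x * x for x in w))
    v = [x / s for x in w]

if v[-1] < v[0]:                  # embeddings are defined up to reflection
    v = [-x for x in v]
# v places objects 0,1 at one end, object 2 in the middle, 3,4 at the other
```

With such coarse binary visibility the embedding only recovers the left/middle/right grouping; richer measurements (e.g. bearings) and nonlinear manifold learners sharpen the map, which is the paper's point.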