Results 1–10 of 379
Face recognition using Laplacianfaces
IEEE Transactions on Pattern Analysis and Machine Intelligence, 2005
Cited by 376 (40 self)
We propose an appearance-based face recognition method called the Laplacianface approach. By using Locality Preserving Projections (LPP), the face images are mapped into a face subspace for analysis. Unlike Principal Component Analysis (PCA) and Linear Discriminant Analysis (LDA), which effectively see only the Euclidean structure of face space, LPP finds an embedding that preserves local information and obtains a face subspace that best detects the essential face manifold structure. The Laplacianfaces are the optimal linear approximations to the eigenfunctions of the Laplace-Beltrami operator on the face manifold. In this way, the unwanted variations resulting from changes in lighting, facial expression, and pose may be eliminated or reduced. Theoretical analysis shows that PCA, LDA, and LPP can be obtained from different graph models. We compare the proposed Laplacianface approach with the Eigenface and Fisherface methods on three different face data sets. Experimental results suggest that the proposed Laplacianface approach provides a better representation and achieves lower error rates in face recognition. Index Terms: face recognition, principal component analysis, linear discriminant analysis, locality preserving projections, face manifold, subspace learning.
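A minimal sketch (not the authors' code) of the LPP step this abstract describes: build a k-nearest-neighbor graph with heat-kernel weights, form the graph Laplacian, and solve the generalized eigenproblem X^T L X a = λ X^T D X a for the smallest eigenvalues. The neighborhood size, kernel width, and ridge term are our assumptions, not the paper's settings.

```python
import numpy as np
from scipy.linalg import eigh

def lpp(X, n_components=2, n_neighbors=5, t=1.0):
    """Locality Preserving Projections, minimal sketch.
    X: (n_samples, n_features). Returns (n_features, n_components)."""
    n = X.shape[0]
    d2 = ((X[:, None, :] - X[None, :, :]) ** 2).sum(-1)  # squared distances
    W = np.zeros((n, n))
    for i in range(n):                       # kNN graph, heat-kernel weights
        nbrs = np.argsort(d2[i])[1:n_neighbors + 1]
        W[i, nbrs] = np.exp(-d2[i, nbrs] / t)
    W = np.maximum(W, W.T)                   # symmetrize
    D = np.diag(W.sum(axis=1))
    L = D - W                                # graph Laplacian
    # generalized eigenproblem: (X^T L X) a = lambda (X^T D X) a
    A = X.T @ L @ X
    B = X.T @ D @ X + 1e-9 * np.eye(X.shape[1])  # small ridge for stability
    _, vecs = eigh(A, B)                     # ascending eigenvalues
    return vecs[:, :n_components]            # keep the smallest

rng = np.random.default_rng(0)
X = rng.normal(size=(40, 8))
P = lpp(X)
print(P.shape)   # (8, 2)
```

Columns of P project a face vector into the low-dimensional subspace; the heat-kernel width t trades off how sharply locality is enforced.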
Diffusion Wavelets
2004
Cited by 149 (16 self)
We present a multiresolution construction for efficiently computing, compressing, and applying large powers of operators that have high powers with low numerical rank. This allows the fast computation of functions of the operator, notably the associated Green's function, in compressed form, and their fast application. Classes of operators satisfying these conditions include diffusion-like operators, in any dimension, on manifolds, graphs, and in non-homogeneous media. In this case our construction can be viewed as a far-reaching generalization of Fast Multipole Methods, achieved through a different point of view, and of the non-standard wavelet representation of Calderón-Zygmund and pseudo-differential operators, achieved through a different multiresolution analysis adapted to the operator. We show how the dyadic powers of an operator can be used to induce a multiresolution analysis, as in classical Littlewood-Paley and wavelet theory, and we show how to construct, with fast and stable algorithms, scaling-function and wavelet bases associated to this multiresolution analysis, and the corresponding downsampling operators, and use them to compress the corresponding powers of the operator. This allows us to extend multiscale signal processing to general spaces (such as manifolds and graphs) in a very natural way, with corresponding fast algorithms.
Covariate shift adaptation by importance weighted cross validation
2000
Cited by 121 (53 self)
A common assumption in supervised learning is that the input points in the training set follow the same probability distribution as the input points that will be given in the future test phase. However, this assumption is not satisfied, for example, when predictions are extrapolated outside the training region. The situation where the training input points and test input points follow different distributions, while the conditional distribution of output values given input points is unchanged, is called covariate shift. Under covariate shift, standard model selection techniques such as cross-validation do not work as desired since their unbiasedness is no longer maintained. In this paper, we propose a new method called importance-weighted cross-validation (IWCV), which we prove to be unbiased even under covariate shift. The IWCV procedure is the only one that can be applied for unbiased classification under covariate shift, whereas alternatives to IWCV exist for regression. The usefulness of the proposed method is illustrated by simulations, and furthermore demonstrated in a brain-computer interface, where strong non-stationarity effects can be seen between training and test sessions. © 2000 Masashi Sugiyama, Matthias Krauledat, and Klaus-Robert Müller.
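The weighting idea behind IWCV can be sketched as follows: each held-out loss is multiplied by the density ratio w(x) = p_test(x)/p_train(x) before averaging. The squared loss, the fold scheme, and the assumption that the density ratio is known exactly are simplifications for illustration; estimating the ratio is a separate problem.

```python
import numpy as np

def iwcv_error(X, y, fit, weight_fn, n_folds=5, seed=0):
    """Importance-weighted cross-validation (sketch): weight each
    held-out squared loss by w(x) = p_test(x) / p_train(x)."""
    rng = np.random.default_rng(seed)
    folds = np.array_split(rng.permutation(len(X)), n_folds)
    errs = []
    for fold in folds:
        train = np.setdiff1d(np.arange(len(X)), fold)
        predict = fit(X[train], y[train])        # fit returns a predictor
        w = weight_fn(X[fold])                   # density ratio at val points
        errs.append(np.average((predict(X[fold]) - y[fold]) ** 2, weights=w))
    return float(np.mean(errs))

# toy check: linear data, density ratio assumed known (uniform here)
def fit_linear(Xtr, ytr):
    a, b = np.polyfit(Xtr, ytr, 1)
    return lambda Z: a * Z + b

rng = np.random.default_rng(1)
X = rng.uniform(-1, 1, 100)
y = 2 * X + 0.1 * rng.normal(size=100)
err = iwcv_error(X, y, fit_linear, lambda Z: np.ones(len(Z)))
```

With uniform weights this reduces to ordinary cross-validation; under covariate shift, a non-trivial weight_fn restores unbiasedness of the error estimate.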
Dimensionality reduction of multimodal labeled data by local Fisher discriminant analysis
Journal of Machine Learning Research, 2007
Cited by 119 (11 self)
Reducing the dimensionality of data without losing intrinsic information is an important preprocessing step in high-dimensional data analysis. Fisher discriminant analysis (FDA) is a traditional technique for supervised dimensionality reduction, but it tends to give undesired results if samples in a class are multimodal. An unsupervised dimensionality reduction method called locality-preserving projection (LPP) can work well with multimodal data due to its locality-preserving property. However, since LPP does not take the label information into account, it is not necessarily useful in supervised learning scenarios. In this paper, we propose a new linear supervised dimensionality reduction method called local Fisher discriminant analysis (LFDA), which effectively combines the ideas of FDA and LPP. LFDA has an analytic form of the embedding transformation, and the solution can be easily computed just by solving a generalized eigenvalue problem. We demonstrate the practical usefulness and high scalability of the LFDA method in data visualization and classification tasks through extensive simulation studies. We also show that LFDA can be extended to nonlinear dimensionality reduction scenarios by applying the kernel trick.
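A rough sketch of the generalized eigenvalue computation this abstract mentions: locality-weighted within- and between-class scatter matrices are built from a pairwise affinity, then one symmetric-definite eigenproblem is solved. The 0/1 kNN affinity here is a simplifying assumption (the paper uses a local-scaling affinity), and the ridge term is ours.

```python
import numpy as np
from scipy.linalg import eigh

def lfda(X, y, n_components=2, knn=5):
    """Local Fisher Discriminant Analysis, minimal sketch."""
    n, d = X.shape
    d2 = ((X[:, None] - X[None]) ** 2).sum(-1)
    A = np.zeros((n, n))
    for i in range(n):                         # 0/1 kNN affinity (assumption)
        A[i, np.argsort(d2[i])[1:knn + 1]] = 1.0
    A = np.maximum(A, A.T)
    same = (y[:, None] == y[None]).astype(float)
    ncls = np.array([(y == yi).sum() for yi in y], float)  # class size per point
    Ww = same * A / ncls                       # local within-class weights
    Wb = same * A * (1.0 / n - 1.0 / ncls) + (1.0 - same) / n
    def scatter(Wm):                           # S = X^T (D - W) X
        L = np.diag(Wm.sum(1)) - Wm
        return X.T @ L @ X
    Sw = scatter(Ww) + 1e-9 * np.eye(d)        # ridge for definiteness
    Sb = scatter(Wb)
    _, V = eigh(Sb, Sw)                        # ascending eigenvalues
    return V[:, ::-1][:, :n_components]        # largest generalized eigvals

rng = np.random.default_rng(2)
X = np.vstack([rng.normal(size=(30, 4)), rng.normal(size=(30, 4)) + 3])
y = np.array([0] * 30 + [1] * 30)
T = lfda(X, y)
print(T.shape)   # (4, 2)
```

Rows of X @ T give the supervised embedding; the locality weights are what let multimodal classes keep their cluster structure.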
Semi-supervised discriminant analysis
In Proc. of the IEEE Int'l Conf. on Comp. Vision (ICCV), Rio de Janeiro, 2007
Cited by 97 (2 self)
Linear Discriminant Analysis (LDA) has been a popular method for extracting features which preserve class separability. The projection vectors are commonly obtained by maximizing the between-class covariance and simultaneously minimizing the within-class covariance. In practice, when there are not sufficient training samples, the covariance matrix of each class may not be accurately estimated. In this paper, we propose a novel method, called Semi-supervised Discriminant Analysis (SDA), which makes use of both labeled and unlabeled samples. The labeled data points are used to maximize the separability between different classes, and the unlabeled data points are used to estimate the intrinsic geometric structure of the data. Specifically, we aim to learn a discriminant function which is as smooth as possible on the data manifold. Experimental results on single-training-image face recognition and relevance-feedback image retrieval demonstrate the effectiveness of our algorithm.
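One plausible reading of the construction above, as a sketch: between-class scatter from the labeled points, plus a graph-Laplacian smoothness penalty built from all points (labeled and unlabeled). The regularization weight alpha, the kNN graph, and the ridge term are our illustrative assumptions, not values from the paper.

```python
import numpy as np
from scipy.linalg import eigh

def sda(X_lab, y, X_unlab, alpha=0.1, knn=3, n_components=1):
    """Semi-supervised discriminant sketch: maximize between-class
    scatter subject to total scatter + Laplacian smoothness."""
    X = np.vstack([X_lab, X_unlab])            # all points build the graph
    n = len(X)
    d2 = ((X[:, None] - X[None]) ** 2).sum(-1)
    W = np.zeros((n, n))
    for i in range(n):
        W[i, np.argsort(d2[i])[1:knn + 1]] = 1.0
    W = np.maximum(W, W.T)
    L = np.diag(W.sum(1)) - W                  # graph Laplacian
    mu = X_lab.mean(0)
    Sb = sum((y == c).sum() * np.outer(X_lab[y == c].mean(0) - mu,
                                       X_lab[y == c].mean(0) - mu)
             for c in np.unique(y))            # between-class scatter
    St = (X_lab - mu).T @ (X_lab - mu)         # total scatter (labeled)
    M = St + alpha * (X.T @ L @ X) + 1e-9 * np.eye(X.shape[1])
    _, V = eigh(Sb, M)                         # ascending eigenvalues
    return V[:, ::-1][:, :n_components]

rng = np.random.default_rng(4)
Xl = np.vstack([rng.normal(size=(10, 3)), rng.normal(size=(10, 3)) + 2])
y = np.array([0] * 10 + [1] * 10)
Xu = rng.normal(size=(20, 3)) + 1              # unlabeled points
a = sda(Xl, y, Xu)
print(a.shape)   # (3, 1)
```

The Laplacian term penalizes projections that vary sharply between neighboring points, which is how the unlabeled data shapes the discriminant direction.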
Orthogonal Laplacianfaces for face recognition
IEEE Trans. Image Process., 2006
Cited by 77 (3 self)
A Dirty Model for Multi-task Learning
In NIPS, 2010
Cited by 64 (2 self)
We consider multi-task learning in the setting of multiple linear regression, where some relevant features could be shared across the tasks. Recent research has studied the use of ℓ1/ℓq-norm block-regularizations with q > 1 for such block-sparse structured problems, establishing strong guarantees on recovery even under high-dimensional scaling where the number of features scales with the number of observations. However, these papers also caution that the performance of such block-regularized methods is very dependent on the extent to which the features are shared across tasks. Indeed, they show [8] that if the extent of overlap is less than a threshold, or even if parameter values in the shared features are highly uneven, then block ℓ1/ℓq regularization could actually perform worse than simple separate element-wise ℓ1 regularization. Since these caveats depend on the unknown true parameters, we might not know when and which method to apply. Even otherwise, we are far away from a realistic multi-task setting: not only do the set of relevant features have to be exactly the same across tasks, but their values
Learning an image manifold for retrieval
In Proc. ACM Multimedia, 2004
Cited by 60 (4 self)
We consider the problem of learning a mapping function from a low-level feature space to a high-level semantic space. Under the assumption that the data lie on a submanifold embedded in a high-dimensional Euclidean space, we propose a relevance feedback scheme which is naturally conducted only on the image manifold in question rather than the total ambient space. While images are typically represented by feature vectors in R^n, the natural distance is often different from the distance induced by the ambient space R^n. The geodesic distances on the manifold are used to measure the similarities between images. However, when the number of data points is small, it is hard to discover the intrinsic manifold structure. Based on user interactions in a relevance-feedback-driven query-by-example system, the intrinsic similarities between images can be accurately estimated. We then develop an algorithmic framework to approximate the optimal mapping function by a Radial Basis Function (RBF) neural network. The semantics of a new image can be inferred by the RBF neural network. Experimental results show that our approach is effective in improving the performance of content-based image retrieval systems.
Clustering and Embedding using Commute Times
IEEE Transactions on Pattern Analysis and Machine Intelligence
Cited by 54 (4 self)
This paper exploits the properties of the commute time between nodes of a graph for the purposes of clustering and embedding, and explores its applications to image segmentation and multi-body motion tracking. Our starting point is the lazy random walk on the graph, which is determined by the heat kernel of the graph and can be computed from the spectrum of the graph Laplacian. We characterize the random walk using the commute time (i.e., the expected time taken for a random walk to travel between two nodes and return) and show how this quantity may be computed from the Laplacian spectrum using the discrete Green's function. Our motivation is that the commute time can be anticipated to be a more robust measure of the proximity of data than the raw proximity matrix. In this paper, we explore two applications of the commute time. The first is to develop a method for image segmentation using the eigenvector corresponding to the smallest eigenvalue of the commute-time matrix. We show that our commute-time segmentation method has the property of enhancing intra-group coherence while weakening inter-group coherence, and is superior to the normalized cut. The second application is to develop a robust multi-body motion tracking method using an embedding based on the commute time. Our embedding procedure preserves commute time and is closely akin to kernel PCA, the Laplacian eigenmap, and the diffusion map. We illustrate the results both on synthetic image sequences and real-world video sequences, and compare our results with several alternative methods.
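The spectral computation the abstract describes can be sketched directly: the commute time follows from the pseudoinverse of the graph Laplacian (the discrete Green's function) as C(u, v) = vol · (G_uu + G_vv − 2 G_uv). The eigenvalue cutoff used to drop the zero mode is our choice; a connected graph is assumed.

```python
import numpy as np

def commute_times(W):
    """Commute-time matrix from the Laplacian spectrum via the
    discrete Green's function. W: symmetric affinity matrix of a
    connected graph."""
    L = np.diag(W.sum(1)) - W                  # graph Laplacian
    lam, phi = np.linalg.eigh(L)
    inv = np.where(lam > 1e-10, 1.0 / lam, 0.0)  # drop the zero eigenvalue
    G = (phi * inv) @ phi.T                    # pseudoinverse of L (Green's fn)
    g = np.diag(G)
    vol = W.sum()                              # graph volume (twice edge weight)
    return vol * (g[:, None] + g[None, :] - 2.0 * G)

# sanity check on the unweighted triangle K3: effective resistance between
# adjacent nodes is 2/3, volume is 6, so the commute time is 4
W = np.ones((3, 3)) - np.eye(3)
C = commute_times(W)
print(np.round(C[0, 1], 6))   # 4.0
```

Embedding the nodes by the scaled Laplacian eigenvectors sqrt(vol/lam_k) · phi_k then preserves these pairwise commute times, which is the embedding the paper builds on.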