Results 1 - 10
of
29
Canonical correlation analysis; An overview with application to learning methods
, 2007
"... We present a general method using kernel Canonical Correlation Analysis to learn a semantic representation to web images and their associated text. The semantic space provides a common representation and enables a comparison between the text and images. In the experiments we look at two approaches o ..."
Abstract
-
Cited by 98 (11 self)
- Add to MetaCart
We present a general method using kernel Canonical Correlation Analysis to learn a semantic representation to web images and their associated text. The semantic space provides a common representation and enables a comparison between the text and images. In the experiments we look at two approaches of retrieving images based only on their content from a text query. We compare the approaches against a standard cross-representation retrieval technique known as the Generalised Vector Space Model.
Euclidean embedding of co-occurrence data
- Advances in Neural Information Processing Systems 17
, 2005
"... Abstract Embedding algorithms search for low dimensional structure in complexdata, but most algorithms only handle objects of a single type for which pairwise distances are specified. This paper describes a method for em-bedding objects of different types, such as images and text, into a single comm ..."
Abstract
-
Cited by 26 (2 self)
- Add to MetaCart
Abstract Embedding algorithms search for low dimensional structure in complexdata, but most algorithms only handle objects of a single type for which pairwise distances are specified. This paper describes a method for em-bedding objects of different types, such as images and text, into a single common Euclidean space based on their co-occurrence statistics. Thejoint distributions are modeled as exponentials of Euclidean distances in the low-dimensional embedding space, which links the problem to con-vex optimization over positive semidefinite matrices. The local structure of our embedding corresponds to the statistical correlations via ran-dom walks in the Euclidean space. We quantify the performance of our method on two text datasets, and show that it consistently and signifi-cantly outperforms standard methods of statistical correspondence modeling, such as multidimensional scaling and correspondence analysis. 1 Introduction Embeddings of objects in a low-dimensional space are an important tool in unsupervisedlearning and in preprocessing data for supervised learning algorithms. They are especially valuable for exploratory data analysis and visualization by providing easily interpretablerepresentations of the relationships among objects. Most current embedding techniques build low dimensional mappings that preserve certain relationships among objects and dif-fer in the relationships they choose to preserve, which range from pairwise distances in multidimensional scaling (MDS) [4] to neighborhood structure in locally linear embedding[12]. All these methods operate on objects of a single type endowed with a measure of similarity or dissimilarity. However, real-world data often involve objects of several very different types without anatural measure of similarity. For example, typical web pages or scientific papers contain
Kernel methods for measuring independence
- Journal of Machine Learning Research
, 2005
"... We introduce two new functionals, the constrained covariance and the kernel mutual information, to measure the degree of independence of random variables. These quantities are both based on the covariance between functions of the random variables in reproducing kernel Hilbert spaces (RKHSs). We prov ..."
Abstract
-
Cited by 25 (13 self)
- Add to MetaCart
We introduce two new functionals, the constrained covariance and the kernel mutual information, to measure the degree of independence of random variables. These quantities are both based on the covariance between functions of the random variables in reproducing kernel Hilbert spaces (RKHSs). We prove that when the RKHSs are universal, both functionals are zero if and only if the random variables are pairwise independent. We also show that the kernel mutual information is an upper bound near independence on the Parzen window estimate of the mutual information. Analogous results apply for two correlation-based dependence functionals introduced earlier: we show the kernel canonical correlation and the kernel generalised variance to be independence measures for universal kernels, and prove the latter to be an upper bound on the mutual information near independence. The performance of the kernel dependence functionals in measuring independence is verified in the context of independent component analysis.
Kernel pls-svc for linear and nonlinear classification
- In Proceedings of the twentieth International Conference on Machine Learning (ICML-2003
, 2003
"... A new method for classification is proposed. This is based on kernel orthonormalized partial least squares (PLS) dimensionality reduction of the original data space followed by a support vector classifier. Unlike principal component analysis (PCA), which has previously served as a dimension reductio ..."
Abstract
-
Cited by 22 (4 self)
- Add to MetaCart
A new method for classification is proposed. This is based on kernel orthonormalized partial least squares (PLS) dimensionality reduction of the original data space followed by a support vector classifier. Unlike principal component analysis (PCA), which has previously served as a dimension reduction step for discrimination problems, orthonormalized PLS is closely related to Fisher’s approach to linear discrimination or equivalently to canonical correlation analysis. For this reason orthonormalized PLS is preferable to PCA for discrimination. Good behavior of the proposed method is demonstrated on 13 different benchmark data sets and on the real world problem of classifying finger movement periods from non-movement periods based on electroencephalograms. 1.
Efficient Leave-One-Out Cross-Validation of Kernel Fisher Discriminant Classifiers
- PATTERN RECOGNITION
, 2003
"... Mika et al. [1] apply the "kernel trick" to obtain a non-linear variant of Fisher's linear discriminant analysis method, demonstrating state-of-the-art performance on a range of benchmark datasets. We show that leave-one-out cross-validation of kernel Fisher discriminant classifiers can be implement ..."
Abstract
-
Cited by 17 (3 self)
- Add to MetaCart
Mika et al. [1] apply the "kernel trick" to obtain a non-linear variant of Fisher's linear discriminant analysis method, demonstrating state-of-the-art performance on a range of benchmark datasets. We show that leave-one-out cross-validation of kernel Fisher discriminant classifiers can be implemented with a computational complexity of only O(l³) operations rather than the O(l^4) of a nave implementation, where l is the number of training patterns. Leave-one-out cross-validation then becomes an attractive means of model selection in large-scale applications of kernel Fisher discriminant analysis, being significantly faster than conventional k-fold cross-validation procedures commonly used.
The Geometry of Kernel Canonical Correlation Analysis
, 2003
"... Canonical correlation analysis (CCA) is a classical multivariate method concerned with describing linear dependencies between sets of variables. After a short exposition of the linear sample CCA problem and its analytical solution, the article proceeds with a detailed characterization of its geometr ..."
Abstract
-
Cited by 12 (0 self)
- Add to MetaCart
Canonical correlation analysis (CCA) is a classical multivariate method concerned with describing linear dependencies between sets of variables. After a short exposition of the linear sample CCA problem and its analytical solution, the article proceeds with a detailed characterization of its geometry. Projection operators are used to illustrate the relations between canonical vectors and variates. The article then addresses the problem of CCA between spaces spanned by objects mapped into kernel feature spaces. An exact solution for this kernel canonical correlation (KCCA) problem is derived from a geometric point of view. It shows that the expansion coefficients of the canonical vectors in their respective feature space can be found by linear CCA in the basis induced by kernel principal component analysis. The effect of mappings into higher dimensional feature spaces is considered critically since it simplifies the CCA problem in general. Then two regularized variants of KCCA are discussed. Relations to other methods are illustrated, e.g., multicategory kernel Fisher discriminant analysis, kernel principal component regression and possible applications thereof in blind source separation.
A correlation approach for automatic image annotation
- In In Springer LNAI 4093
, 2006
"... Abstract. The automatic annotation of images presents a particularly complex problem for machine learning researchers. In this work we experiment with semantic models and multi-class learning for the automatic annotation of query images. We represent the images using scale invariant transformation d ..."
Abstract
-
Cited by 10 (3 self)
- Add to MetaCart
Abstract. The automatic annotation of images presents a particularly complex problem for machine learning researchers. In this work we experiment with semantic models and multi-class learning for the automatic annotation of query images. We represent the images using scale invariant transformation descriptors in order to account for similar objects appearing at slightly different scales and transformations. The resulting descriptors are utilised as visual terms for each image. We first aim to annotate query images by retrieving images that are similar to the query image. This approach uses the analogy that similar images would be annotated similarly as well. We then propose an image annotation method that learns a direct mapping from image descriptors to keywords. We compare the semantic based methods of Latent Semantic Indexing and Kernel Canonical Correlation Analysis (KCCA), as well as using a recently proposed vector label based learning method known as Maximum Margin Robot. 1
The Kernel Mutual Information
, 2003
"... We introduce two new functions, the kernel covariance (KC) and the kernel mutual information (KMI), to measure the degree of independence of several continuous random variables. The former is guaranteed to be zero if and only if the random variables are pairwise independent; the latter shares this p ..."
Abstract
-
Cited by 8 (3 self)
- Add to MetaCart
We introduce two new functions, the kernel covariance (KC) and the kernel mutual information (KMI), to measure the degree of independence of several continuous random variables. The former is guaranteed to be zero if and only if the random variables are pairwise independent; the latter shares this property, and is in addition an approximate upper bound on the mutual information, as measured near independence, and is based on a kernel density estimate. We show that Bach and Jordan's kernel generalised variance (KGV) is also an upper bound on the same kernel density estimate, but is looser. Finally, we suggest that the addition of a regularising term in the KGV causes it to approach the KMI, which motivates the introduction of this regularisation. The performance of the KC and KMI is veried in the context of instantaneous independent component analysis (ICA), by recovering both articial and real (musical) signals following linear mixing.
Correlational spectral clustering
- In CVPR ’08
"... We present a new method for spectral clustering with paired data based on kernel canonical correlation analysis, called correlational spectral clustering. Paired data are common in real world data sources, such as images with text captions. Traditional spectral clustering algorithms either assume th ..."
Abstract
-
Cited by 7 (1 self)
- Add to MetaCart
We present a new method for spectral clustering with paired data based on kernel canonical correlation analysis, called correlational spectral clustering. Paired data are common in real world data sources, such as images with text captions. Traditional spectral clustering algorithms either assume that data can be represented by a single similarity measure, or by co-occurrence matrices that are then used in biclustering. In contrast, the proposed method uses separate similarity measures for each data representation, and allows for projection of previously unseen data that are only observed in one representation (e.g. images but not text). We show that this algorithm generalizes traditional spectral clustering algorithms and show consistent empirical improvement over spectral clustering on a variety of datasets of images with associated text. 1.
Fusing Gabor and LBP Feature Sets for Kernel-Based Face Recognition
"... Abstract. Extending recognition to uncontrolled situations is a key challenge for practical face recognition systems. Finding efficient and discriminative facial appearance descriptors is crucial for this. Most existing approaches use features of just one type. Here we argue that robust recognition ..."
Abstract
-
Cited by 7 (0 self)
- Add to MetaCart
Abstract. Extending recognition to uncontrolled situations is a key challenge for practical face recognition systems. Finding efficient and discriminative facial appearance descriptors is crucial for this. Most existing approaches use features of just one type. Here we argue that robust recognition requires several different kinds of appearance information to be taken into account, suggesting the use of heterogeneous feature sets. We show that combining two of the most successful local face representations, Gabor wavelets and Local Binary Patterns (LBP), gives considerably better performance than either alone: they are complimentary in the sense that LBP captures small appearance details while Gabor features encode facial shape over a broader range of scales. Both feature sets are high dimensional so it is beneficial to use PCA to reduce the dimensionality prior to normalization and integration. The Kernel Discriminative Common Vector method is then applied to the combined feature vector to extract discriminant nonlinear features for recognition. The method is evaluated on several challenging face datasets including FRGC 1.0.4, FRGC 2.0.4 and FERET, with promising results. 1

