Results 1 - 10
of
17
Nonlinear component analysis as a kernel eigenvalue problem
-
, 1996
"... We describe a new method for performing a nonlinear form of Principal Component Analysis. By the use of integral operator kernel functions, we can efficiently compute principal components in high-dimensional feature spaces, related to input space by some nonlinear map; for instance the space of all ..."
Abstract
-
Cited by 775 (63 self)
- Add to MetaCart
We describe a new method for performing a nonlinear form of Principal Component Analysis. By the use of integral operator kernel functions, we can efficiently compute principal components in high-dimensional feature spaces, related to input space by some nonlinear map; for instance the space of all possible 5-pixel products in 16x16 images. We give the derivation of the method, along with a discussion of other techniques which can be made nonlinear with the kernel approach; and present first experimental results on nonlinear feature extraction for pattern recognition.
Kernel PCA and De-Noising in Feature Spaces
- ADVANCES IN NEURAL INFORMATION PROCESSING SYSTEMS 11
, 1999
"... Kernel PCA as a nonlinear feature extractor has proven powerful as a preprocessing step for classification algorithms. But it can also be considered as a natural generalization of linear principal component analysis. This gives rise to the question how to use nonlinear features for data compress ..."
Abstract
-
Cited by 94 (12 self)
- Add to MetaCart
Kernel PCA as a nonlinear feature extractor has proven powerful as a preprocessing step for classification algorithms. But it can also be considered as a natural generalization of linear principal component analysis. This gives rise to the question how to use nonlinear features for data compression, reconstruction, and de-noising, applications common in linear PCA. This is a nontrivial task, as the results provided by kernel PCA live in some high dimensional feature space and need not have pre-images in input space. This work presents ideas for finding approximate pre-images, focusing on Gaussian kernels, and shows experimental results using these pre-images in data reconstruction and de-noising on toy examples as well as on real world data.
Efficient Back Prop
, 1996
"... HINE Parameters X0, X1, ....Xp Output E0, E1,....Ep Error Desired Output D0, D1,...Dp Y0, Y1,...Yp Input w w0 w1 AT&T Laboratories (c) COST FUNCTION Output E0, E1,....Ep Error Desired Output D0, D1,...Dp Y0, Y1,...Yp X0, X1, ....Xp Input Parameters w B R A COMPUTING THE GRADIENT WITH BACKPROPAGATIO ..."
Abstract
-
Cited by 93 (16 self)
- Add to MetaCart
HINE Parameters X0, X1, ....Xp Output E0, E1,....Ep Error Desired Output D0, D1,...Dp Y0, Y1,...Yp Input w w0 w1 AT&T Laboratories (c) COST FUNCTION Output E0, E1,....Ep Error Desired Output D0, D1,...Dp Y0, Y1,...Yp X0, X1, ....Xp Input Parameters w B R A COMPUTING THE GRADIENT WITH BACKPROPAGATION O = A(I1, I2) dI1 = dO ¶ A ¶ I1 dI2 = dO ¶ A ¶ I2 - The learning machine is composed of modules (e.g. layers) - Each module can do two things: 1- compute its outputs from its inputs (FPROP) 2- compute gradient vectors at its inputs from gradient vectors at its outputs (BPROP) A O, dO I1, dI1 I2, dI2 AT&T Laboratories (c) AN INTERESTING SPECIAL CASE: MULTILAYER NETWORKS X0, X1, ....Xp Output Desired Output D0, D1,...Dp Y0, Y1,...Yp Input || D - Y || 2 2 1 WX F() WX F() Mean Square Error Parameters (weights + biases) w Weight matrix E0, E1,....Ep Sigmoids + Biase
Kernel PCA Pattern Reconstruction via Approximate Pre-Images
, 1998
"... Algorithms based on Mercer kernels construct their solutions in terms of expansions in a high-dimensional feature space F . Previous work has shown that all algorithms which can be formulated in terms of dot products in F can be performed using a kernel without explicitly working in F . The list of ..."
Abstract
-
Cited by 32 (2 self)
- Add to MetaCart
Algorithms based on Mercer kernels construct their solutions in terms of expansions in a high-dimensional feature space F . Previous work has shown that all algorithms which can be formulated in terms of dot products in F can be performed using a kernel without explicitly working in F . The list of such algorithms includes support vector machines and nonlinear kernel principal component extraction. So far, however, it did not include the reconstruction of patterns from their largest nonlinear principal components, a technique which is common practice in linear principal component analysis. The present work proposes an idea for approximately performing this task. As an illustrative example, an application to the de-noising of data clusters is presented. 1 Kernels and Feature Spaces A Mercer kernel is a function k(x; y) which for all data sets fx 1 ; : : : ; x ` g ae R N gives rise to a positive (not necessarily definite) matrix K ij := k(x i ; x j ) [4]. One can show that using k ins...
Invariant Feature Extraction and Classification in Kernel Spaces
"... We incorporate prior knowledge to construct nonlinear algorithms for invariant feature extraction and discrimination. Employing a unified framework in terms of a nonlinear variant of the Rayleigh coefficient, we propose non-linear generalizations of Fisher's discriminant and oriented PCA using S ..."
Abstract
-
Cited by 32 (7 self)
- Add to MetaCart
We incorporate prior knowledge to construct nonlinear algorithms for invariant feature extraction and discrimination. Employing a unified framework in terms of a nonlinear variant of the Rayleigh coefficient, we propose non-linear generalizations of Fisher's discriminant and oriented PCA using Support Vector kernel functions.
Constructing Descriptive and Discriminative Nonlinear Features: Rayleigh Coefficients in Kernel Feature Spaces
, 2003
"... We incorporate prior knowledge to construct nonlinear algorithms for invariant feature extraction and discrimination. Employing a unified framework in terms of a nonlinearized variant of the Rayleigh coefficient, we propose nonlinear generalizations of Fisher's discriminant and oriented PCA using su ..."
Abstract
-
Cited by 30 (4 self)
- Add to MetaCart
We incorporate prior knowledge to construct nonlinear algorithms for invariant feature extraction and discrimination. Employing a unified framework in terms of a nonlinearized variant of the Rayleigh coefficient, we propose nonlinear generalizations of Fisher's discriminant and oriented PCA using support vector kernel functions. Extensive simulations show the utility of our approach.
Geometric Methods for Feature Extraction and Dimensional Reduction
- In L. Rokach and O. Maimon (Eds.), Data
, 2005
"... Abstract We give a tutorial overview of several geometric methods for feature extraction and dimensional reduction. We divide the methods into projective methods and methods that model the manifold on which the data lies. For projective methods, we review projection pursuit, principal component anal ..."
Abstract
-
Cited by 24 (1 self)
- Add to MetaCart
Abstract We give a tutorial overview of several geometric methods for feature extraction and dimensional reduction. We divide the methods into projective methods and methods that model the manifold on which the data lies. For projective methods, we review projection pursuit, principal component analysis (PCA), kernel PCA, probabilistic PCA, and oriented PCA; and for the manifold methods, we review multidimensional scaling (MDS), landmark MDS, Isomap, locally linear embedding, Laplacian eigenmaps and spectral clustering. The Nyström method, which links several of the algorithms, is also reviewed. The goal is to provide a self-contained review of the concepts and mathematics underlying these algorithms.
A robust PCA algorithm for building representations from panoramic images
- In European Conference Computer Vision
, 2002
"... Abstract. Appearance-based modeling of objects and scenes using PCA has been successfully applied in many recognition tasks. Robust methods which have made the recognition stage less susceptible to outliers, occlusions, and varying illumination have further enlarged the domain of applicability. Howe ..."
Abstract
-
Cited by 22 (9 self)
- Add to MetaCart
Abstract. Appearance-based modeling of objects and scenes using PCA has been successfully applied in many recognition tasks. Robust methods which have made the recognition stage less susceptible to outliers, occlusions, and varying illumination have further enlarged the domain of applicability. However, much less research has been done in achieving robustness in the learning stage. In this paper, we propose a novel robust PCA method for obtaining a consistent subspace representation in the presence of outlying pixels in the training images. The method is based on the EM algorithm for estimation of principal subspaces in the presence of missing data. By treating the outlying points as missing pixels, we arrive at a robust PCA representation. We demonstrate experimentally that the proposed method is efficient. In addition, we apply the method to a set of panoramic images to build a representation that enables surveillance and view-based mobile robot localization. 1
Redundant Bit Vectors for Quickly Searching High-Dimensional Regions
- In Deterministic and Statistical Methods in Machine Learning
, 2005
"... Abstract. Applications such as audio fingerprinting require search in high dimensions: find an item in a database that is similar to a query. An important property of this search task is that negative answers are very frequent: much of the time, a query does not correspond to any database item. We p ..."
Abstract
-
Cited by 8 (0 self)
- Add to MetaCart
Abstract. Applications such as audio fingerprinting require search in high dimensions: find an item in a database that is similar to a query. An important property of this search task is that negative answers are very frequent: much of the time, a query does not correspond to any database item. We propose Redundant Bit Vectors (RBVs):anovelmethodforquickly solving this search problem. RBVs rely on three key ideas: 1) approximate the high-dimensional regions/distributions as tightened hyperrectangles, 2) partition the query space to store each item redundantly in an index and 3) use bit vectors to store and search the index efficiently. We show that our method is the preferred method for very large databases or when the queries are often not in the database. Our method is 109 times faster than linear scan, and 48 times faster than localitysensitive hashing on a data set of 239369 audio fingerprints. 1
ORASSYLL: Object Recognition with Autonomously Learned and Sparse Symbolic Representations Based on Metrically Organized Local Line Detectors (Object Recognition with ORASSYLL)
, 2000
"... We introduce an object recognition and localization system in which objects are represented as a sparse and spatially organized set of local (bent) line segments. ..."
Abstract
-
Cited by 6 (3 self)
- Add to MetaCart
We introduce an object recognition and localization system in which objects are represented as a sparse and spatially organized set of local (bent) line segments.

