Results 1 - 10 of 52
Graph embedding and extension: A general framework for dimensionality reduction
- IEEE Trans. Pattern Anal. Mach. Intell., 2007
"... Over the past few decades, a large family of algorithms—supervised or unsupervised; stemming from statistics or geometry theory—has been designed to provide different solutions to the problem of dimensionality reduction. Despite the different motivations of these algorithms, we present in this paper ..."
Abstract - Cited by 271 (29 self)
Over the past few decades, a large family of algorithms, supervised or unsupervised and stemming from statistics or from geometry, has been designed to provide different solutions to the problem of dimensionality reduction. Despite the different motivations of these algorithms, we present in this paper a general formulation known as graph embedding that unifies them within a common framework. In graph embedding, each algorithm can be considered as the direct graph embedding, or the linear/kernel/tensor extension, of a specific intrinsic graph that describes certain desired statistical or geometric properties of a data set, with constraints from scale normalization or from a penalty graph that characterizes a statistical or geometric property to be avoided. Furthermore, the graph embedding framework can be used as a general platform for developing new dimensionality reduction algorithms. Using this framework as a tool, we propose a new supervised dimensionality reduction algorithm called Marginal Fisher Analysis (MFA), in which the intrinsic graph characterizes intraclass compactness and connects each data point with its neighboring points of the same class, while the penalty graph connects the marginal points and characterizes interclass separability. We show that MFA effectively overcomes the limitations of traditional Linear Discriminant Analysis (LDA) that stem from its data distribution assumptions and its limited set of available projection directions. Real face recognition experiments show the superiority of MFA over LDA, for the corresponding kernel and tensor extensions as well.
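The unifying objective this abstract refers to has a compact standard statement; the sketch below uses assumed notation (W is the intrinsic graph's affinity matrix, L its Laplacian, and B either a scale-normalization matrix or the Laplacian of the penalty graph), none of which is quoted from this listing:

```latex
% Direct graph embedding: keep points that the intrinsic graph links with
% large weights W_{ij} close together in the embedding y.
y^{*} = \arg\min_{y^{\top} B y = d} \sum_{i \neq j} \lVert y_i - y_j \rVert^{2} W_{ij}
      = \arg\min_{y^{\top} B y = d} y^{\top} L y,
\qquad L = D - W, \quad D_{ii} = \textstyle\sum_{j} W_{ij}
```

Substituting a linear projection y = X^T w, a kernelized y, or a tensor-valued projection into this single objective is what recovers the linear, kernel, and tensor variants the abstract mentions.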
Characterization of a family of algorithms for generalized discriminant analysis on undersampled problems
- Journal of Machine Learning Research, 2005
"... A generalized discriminant analysis based on a new optimization criterion is presented. The criterion extends the optimization criteria of the classical Linear Discriminant Analysis (LDA) when the scatter matrices are singular. An efficient algorithm for the new optimization problem is presented. Th ..."
Abstract - Cited by 72 (13 self)
A generalized discriminant analysis based on a new optimization criterion is presented. The criterion extends the optimization criteria of the classical Linear Discriminant Analysis (LDA) when the scatter matrices are singular. An efficient algorithm for the new optimization problem is presented. The solutions to the proposed criterion form a family of algorithms for generalized LDA, which can be characterized in a closed form. We study two specific algorithms, namely Uncorrelated LDA (ULDA) and Orthogonal LDA (OLDA). ULDA was previously proposed for feature extraction and dimension reduction, whereas OLDA is a novel algorithm proposed in this paper. The features in the reduced space of ULDA are uncorrelated, while the discriminant vectors of OLDA are orthogonal to each other. We have conducted a comparative study on a variety of real-world data sets to evaluate ULDA and OLDA in terms of classification accuracy.
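For intuition, the kind of criterion involved can be sketched in assumed notation, with S_b and S_t the between-class and total scatter matrices and (·)^+ the pseudo-inverse that keeps the expression defined when S_t is singular:

```latex
G^{*} = \arg\max_{G}\; \operatorname{trace}\!\left( \left( G^{\top} S_t G \right)^{+} G^{\top} S_b G \right)
```

Within this family, requiring G^T S_t G = I yields uncorrelated features (the ULDA case), while requiring G^T G = I yields mutually orthogonal discriminant vectors (the OLDA case).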
Capitalize on dimensionality increasing techniques for improving face recognition grand challenge performance
- IEEE Trans. Pattern Anal. Mach. Intell., 2006
"... Abstract—This paper presents a novel pattern recognition framework by capitalizing on dimensionality increasing techniques. In particular, the framework integrates Gabor image representation, a novel multiclass Kernel Fisher Analysis (KFA) method, and fractional power polynomial models for improving ..."
Abstract - Cited by 56 (2 self)
This paper presents a novel pattern recognition framework that capitalizes on dimensionality increasing techniques. In particular, the framework integrates Gabor image representation, a novel multiclass Kernel Fisher Analysis (KFA) method, and fractional power polynomial models to improve pattern recognition performance. Gabor image representation, which increases dimensionality by incorporating Gabor filters with different scales and orientations, is characterized by spatial frequency, spatial locality, and orientational selectivity for coping with image variabilities such as illumination variations. The KFA method first performs a nonlinear mapping from the input space to a high-dimensional feature space and then applies multiclass Fisher discriminant analysis in that feature space. The significance of the nonlinear mapping is that it increases the discriminating power of the KFA method, which is linear in the feature space but nonlinear in the input space. The novelty of the KFA method comes from the fact that 1) it extends the two-class kernel Fisher methods to multiclass pattern classification problems and 2) it improves upon the traditional Generalized Discriminant Analysis (GDA) method by deriving a unique solution (the GDA solution is not unique). The fractional power polynomial models further improve the performance of the proposed framework. Experiments on face recognition using both the FERET database and the FRGC (Face Recognition Grand Challenge) databases show the feasibility of the proposed framework. In particular, results on the FERET database show that the KFA method performs better than the GDA method and that the fractional power polynomial models help both methods improve their face recognition performance; results on the FRGC databases show that the proposed framework improves face recognition performance.
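As a small illustration of the kernel machinery such a method builds on, here is a numpy sketch of a fractional power polynomial kernel and feature-space centering; the function names, the sign-preserving form, and the default exponent are assumptions for illustration, not the paper's code:

```python
import numpy as np

def frac_poly_kernel(X, Y, d=0.8):
    """Fractional power polynomial kernel matrix for 0 < d < 1.

    A sign-preserving variant is used so negative inner products stay well
    defined; the exact form used in the paper may differ (illustration only).
    """
    G = X @ Y.T
    return np.sign(G) * np.abs(G) ** d

def center_kernel(K):
    """Center a square kernel matrix in feature space, the standard step
    before a kernel Fisher-style eigenanalysis on the Gram matrix."""
    n = K.shape[0]
    one_n = np.ones((n, n)) / n
    return K - one_n @ K - K @ one_n + one_n @ K @ one_n
```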
Discriminant analysis with tensor representation
- Proc. IEEE Conf. Comput. Vision Pattern Recognit., 2005
"... In this paper, we present a novel approach to solving the supervised dimensionality reduction problem by encoding an image object as a general tensor of 2nd or higher order. First, we propose a Discriminant Tensor Criterion (DTC), whereby multiple interrelated lower-dimensional discriminative subspa ..."
Abstract - Cited by 53 (13 self)
In this paper, we present a novel approach to solving the supervised dimensionality reduction problem by encoding an image object as a general tensor of 2nd or higher order. First, we propose a Discriminant Tensor Criterion (DTC), whereby multiple interrelated lower-dimensional discriminative subspaces are derived for feature selection. Then, a novel approach called k-mode Cluster-based Discriminant Analysis is presented to iteratively learn these subspaces by unfolding the tensor along different tensor dimensions. We call this algorithm Discriminant Analysis with Tensor Representation (DATER), which has the following characteristics: 1) multiple interrelated subspaces can collaborate to discriminate different classes; 2) for classification problems involving higher-order tensors, the DATER algorithm can avoid the curse of dimensionality dilemma and overcome the small sample size problem; and 3) the computational cost in the learning stage is reduced to a large extent owing to the reduced data dimensions in generalized eigenvalue decomposition. We provide extensive experiments by encoding face images as 2nd or 3rd order tensors to demonstrate that the proposed DATER algorithm based on higher order tensors has the potential to outperform the traditional subspace learning algorithms, especially in the small sample size cases.
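The iterative subspace learning rests on unfolding a tensor along each of its modes; a minimal numpy sketch of that one operation follows (the helper name and the toy shapes are illustrative assumptions):

```python
import numpy as np

def unfold(tensor, mode):
    """Mode-k unfolding: bring axis `mode` to the front and flatten the rest,
    so every column of the result is a mode-k fiber of the tensor."""
    return np.moveaxis(tensor, mode, 0).reshape(tensor.shape[mode], -1)

# A 3rd-order "image stack" unfolded along each mode; in a DATER-style sweep,
# each unfolding feeds one small generalized eigenproblem per iteration.
T = np.random.rand(32, 32, 40)      # e.g. 32x32 face images, 40 filter channels
for k in range(T.ndim):
    print(k, unfold(T, k).shape)    # (32, 1280), (32, 1280), (40, 1024)
```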
Geometric mean for subspace selection
- IEEE Trans. Pattern Anal. Mach. Intell., 2009
"... Abstract—Subspace selection approaches are powerful tools in pattern classification and data visualization. One of the most important subspace approaches is the linear dimensionality reduction step in the Fisher’s linear discriminant analysis (FLDA), which has been successfully employed in many fiel ..."
Abstract - Cited by 52 (11 self)
Subspace selection approaches are powerful tools in pattern classification and data visualization. One of the most important subspace approaches is the linear dimensionality reduction step in Fisher's linear discriminant analysis (FLDA), which has been successfully employed in many fields such as biometrics, bioinformatics, and multimedia information management. However, the linear dimensionality reduction step in FLDA has a critical drawback: for a classification task with c classes, if the dimension of the projected subspace is strictly lower than c − 1, the projection tends to merge classes that are close together in the original feature space. If separate classes are sampled from Gaussian distributions, all with identical covariance matrices, then the linear dimensionality reduction step in FLDA maximizes the arithmetic mean of the Kullback-Leibler (KL) divergences between the different classes. Based on this viewpoint, the geometric mean for subspace selection is studied in this paper. Three criteria are analyzed: 1) maximization of the geometric mean of the KL divergences, 2) maximization of the geometric mean of the normalized KL divergences, and 3) a combination of 1 and 2. Preliminary experimental results on synthetic data, the UCI Machine Learning Repository, and handwritten digits show that the third criterion is a promising discriminative subspace selection method that significantly reduces the class-separation problem compared with the linear dimensionality reduction step in FLDA and several of its representative extensions.
Index Terms: Arithmetic mean, Fisher's linear discriminant analysis (FLDA), geometric mean, Kullback-Leibler (KL) divergence, machine learning, subspace selection (or dimensionality reduction), visualization.
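To make the criterion concrete, the geometric-mean idea can be written as follows, with D(i‖j) denoting the KL divergence between the projected class-i and class-j Gaussians; the notation is an assumption based on the abstract, not quoted from the paper:

```latex
% Maximizing a geometric mean equals maximizing the mean of logarithms, so
% small pairwise divergences (close classes) cannot be drowned out by large
% ones the way they can be under the arithmetic mean implicit in FLDA.
W^{*} = \arg\max_{W} \Bigl( \prod_{i<j} D(i \,\Vert\, j) \Bigr)^{1/\binom{c}{2}}
      = \arg\max_{W} \frac{1}{\binom{c}{2}} \sum_{i<j} \log D(i \,\Vert\, j)
```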
A two-stage linear discriminant analysis via QR-decomposition
- IEEE Trans. Pattern Anal. Mach. Intell., 2005
"... Linear Discriminant Analysis (LDA) is a well-known method for feature extraction and dimension reduction. It has been used widely in many applications involving high-dimensional data, such as image and text classification. An intrinsic limitation of classical LDA is the so-called singularity proble ..."
Abstract - Cited by 48 (0 self)
Linear Discriminant Analysis (LDA) is a well-known method for feature extraction and dimension reduction. It has been used widely in many applications involving high-dimensional data, such as image and text classification. An intrinsic limitation of classical LDA is the so-called singularity problem: it fails when all scatter matrices are singular. Many LDA extensions have been proposed to overcome the singularity problem. Among these extensions, PCA+LDA, a two-stage method, has received relatively more attention. In PCA+LDA, the LDA stage is preceded by an intermediate dimension reduction stage using Principal Component Analysis (PCA). Most previous LDA extensions are computationally expensive, and not scalable, because they rely on the Singular Value Decomposition or the Generalized Singular Value Decomposition. In this paper, we propose a two-stage LDA method, namely LDA/QR, which aims to overcome the singularity problem of classical LDA while achieving efficiency and scalability simultaneously. The key difference between LDA/QR and PCA+LDA lies in the first stage: LDA/QR applies QR decomposition to a small matrix involving the class centroids, whereas PCA+LDA applies PCA to the total scatter matrix involving all training data points. We further justify the proposed algorithm by showing the relationship between LDA/QR and previous LDA methods. Extensive experiments on face images and text documents are presented to show the effectiveness of the proposed algorithm.
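A compact numpy sketch of the two-stage structure described above follows, using the usual between-class and within-class scatter definitions; it is an illustration under those assumptions, not the paper's exact algorithm:

```python
import numpy as np

def lda_qr(X, y):
    """Two-stage LDA/QR sketch: stage 1 reduces dimension with a cheap QR of
    the d x c class-centroid matrix; stage 2 runs classical LDA in that tiny
    c-dimensional space."""
    classes = np.unique(y)
    C = np.stack([X[y == c].mean(axis=0) for c in classes], axis=1)  # d x c
    Q, _ = np.linalg.qr(C)        # orthonormal basis for the centroid span
    Z = X @ Q                     # n x c representation of the data

    mean_all = Z.mean(axis=0)
    Sb = np.zeros((Z.shape[1], Z.shape[1]))
    Sw = np.zeros_like(Sb)
    for c in classes:
        Zc = Z[y == c]
        diff = (Zc.mean(axis=0) - mean_all)[:, None]
        Sb += len(Zc) * diff @ diff.T
        Sw += (Zc - Zc.mean(axis=0)).T @ (Zc - Zc.mean(axis=0))

    # The c x c problem is small, so a pseudo-inverse solve is cheap and safe.
    evals, evecs = np.linalg.eig(np.linalg.pinv(Sw) @ Sb)
    order = np.argsort(evals.real)[::-1]
    return Q @ evecs[:, order].real   # directions back in the original space
```

Because stage 1 never forms a d x d scatter matrix, this structure stays feasible when the data dimension d runs into the thousands.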
Concurrent subspaces analysis
- Proc. IEEE Conf. Comput. Vision Pattern Recognit., 2005
"... A representative subspace is significant for image analysis, while the corresponding techniques often suffer from the curse of dimensionality dilemma. In this paper, we propose a new algorithm, called Concurrent Subspaces Analysis (CSA), to derive representative subspaces by encoding image objects a ..."
Abstract - Cited by 41 (5 self)
A representative subspace is important for image analysis, but the corresponding techniques often suffer from the curse of dimensionality. In this paper, we propose a new algorithm, called Concurrent Subspaces Analysis (CSA), to derive representative subspaces by encoding image objects as 2nd or even higher-order tensors. In CSA, an original higher-dimensional tensor is transformed into a lower-dimensional one using multiple concurrent subspaces that characterize the most representative information of the different dimensions, respectively. Moreover, an efficient procedure is provided to learn these subspaces in an iterative manner. As analyzed in this paper, each sub-step of CSA takes as the new objects to be analyzed the column vectors of the matrices acquired from the k-mode unfolding of the tensors, so the curse of dimensionality can be effectively avoided. Extensive experiments on 3rd-order tensor data, simulated video sequences, and a Gabor-filtered digit image database show that CSA outperforms Principal Component Analysis in terms of both reconstruction and classification capability.
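The heart of the idea, for a 2nd-order tensor (a matrix), is that two small concurrent subspaces replace one enormous vectorized basis. The shapes and the random placeholder bases below are illustrative assumptions; CSA learns U1 and U2 iteratively rather than drawing them at random:

```python
import numpy as np

X = np.random.rand(64, 64)                      # one image kept as a matrix
U1 = np.linalg.qr(np.random.rand(64, 8))[0]     # row-mode subspace, 64 -> 8
U2 = np.linalg.qr(np.random.rand(64, 8))[0]     # column-mode subspace, 64 -> 8

Y = U1.T @ X @ U2          # 8 x 8 representation instead of a 4096-d vector
X_hat = U1 @ Y @ U2.T      # reconstruction whose error CSA seeks to minimize
print(Y.shape, np.linalg.norm(X - X_hat))
```

Learning each subspace in turn from the k-mode unfoldings, with the other subspaces held fixed, is what keeps every sub-step low dimensional.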
Incremental linear discriminant analysis for classification of data streams
- IEEE Transactions on Systems, Man and Cybernetics - Part B
"... Abstract—This paper presents a constructive method for de-riving an updated discriminant eigenspace for classification when bursts of data that contains new classes is being added to an initial discriminant eigenspace in the form of random chunks. Basically, we propose an incremental linear discrimi ..."
Abstract - Cited by 36 (5 self)
This paper presents a constructive method for deriving an updated discriminant eigenspace for classification when bursts of data that contain new classes are added to an initial discriminant eigenspace in the form of random chunks. Specifically, we propose an incremental linear discriminant analysis (ILDA) in two forms: a sequential ILDA and a chunk ILDA. In experiments, we have tested ILDA on datasets with a small number of classes and low-dimensional features, as well as datasets with a large number of classes and high-dimensional features. We have compared the proposed ILDA against traditional batch LDA in terms of discriminability, execution time, and memory usage as the volume of added data increases. The results show that the proposed ILDA can effectively evolve a discriminant eigenspace over a fast and large data stream, and can extract features with superior discriminability in classification when compared with other methods.
Index Terms: Classification, data stream, incremental linear discriminant analysis, incremental principal component analysis, linear discriminant analysis, pattern recognition, principal component analysis.
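A sketch of the bookkeeping that makes any such incremental scheme possible: sufficient statistics that absorb chunks without revisiting old data. The class below illustrates that idea in numpy; it is not the paper's ILDA update rules:

```python
import numpy as np

class RunningScatter:
    """Accumulates n, sum(x), and sum(x x^T) so the scatter matrix of all
    data seen so far can be rebuilt after each chunk arrives."""
    def __init__(self, dim):
        self.n = 0
        self.s = np.zeros(dim)
        self.outer = np.zeros((dim, dim))

    def add_chunk(self, X):
        self.n += len(X)
        self.s += X.sum(axis=0)
        self.outer += X.T @ X

    def scatter(self):
        mean = self.s / self.n
        # sum_i (x_i - m)(x_i - m)^T == sum_i x_i x_i^T - n * m m^T
        return self.outer - self.n * np.outer(mean, mean)
```

Keeping one accumulator per class gives the within-class scatter; the per-class means and counts give the between-class scatter, after which the discriminant eigenproblem is re-solved on the updated matrices.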
Using uncorrelated discriminant analysis for tissue classification with gene expression data
- IEEE/ACM Transactions on Computational Biology and Bioinformatics, 2004
"... Abstract—The classification of tissue samples based on gene expression data is an important problem in medical diagnosis of diseases such as cancer. In gene expression data, the number of genes is usually very high (in the thousands) compared to the number of data samples (in the tens or low hundred ..."
Abstract - Cited by 32 (5 self)
The classification of tissue samples based on gene expression data is an important problem in the medical diagnosis of diseases such as cancer. In gene expression data, the number of genes is usually very high (in the thousands) compared to the number of data samples (in the tens or low hundreds); that is, the data dimension is large compared to the number of data points (such data is said to be undersampled). To cope with the performance and accuracy problems associated with high dimensionality, it is commonplace to apply a preprocessing step that transforms the data to a space of significantly lower dimension with limited loss of the information present in the original data. Linear Discriminant Analysis (LDA) is a well-known technique for dimension reduction and feature extraction, but it is not applicable to undersampled data because of singularity problems associated with the matrices in the underlying representation. This paper presents a dimension reduction and feature extraction scheme, called Uncorrelated Linear Discriminant Analysis (ULDA), for undersampled problems and illustrates its utility on gene expression data. ULDA employs the Generalized Singular Value Decomposition to handle undersampled data, and the features it produces in the transformed space are uncorrelated, which makes it attractive for gene expression data. The properties of ULDA are established rigorously, and extensive experimental results on gene expression data are presented to illustrate its effectiveness in classifying tissue samples. These results provide a comparative study of various state-of-the-art classification methods on well-known gene expression data sets.
Index Terms: Microarray data analysis, discriminant analysis, generalized singular value decomposition, classification.
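The "uncorrelated" property has a one-line statement; the notation below (S_t the total scatter matrix, g_i the discriminant vectors) is assumed rather than quoted:

```latex
% Extracted features g_i^T x are mutually uncorrelated over the data set
% exactly when the discriminant vectors are S_t-orthogonal:
\operatorname{cov}\!\left( g_i^{\top} x,\; g_j^{\top} x \right) \;\propto\; g_i^{\top} S_t\, g_j = 0
\quad \text{for } i \neq j
```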
Feature reduction via generalized uncorrelated linear discriminant analysis
- IEEE Trans. Knowl. Data Eng., 2006
"... High-dimensional data appear in many applications of data mining, machine learning, and bioinformatics. Feature reduction is commonly applied as a preprocessing step to overcome the curse of dimensionality. Uncorrelated Linear Discriminant Analysis (ULDA) was recently proposed for feature reduction. ..."
Abstract - Cited by 16 (0 self)
High-dimensional data appear in many applications of data mining, machine learning, and bioinformatics. Feature reduction is commonly applied as a preprocessing step to overcome the curse of dimensionality. Uncorrelated Linear Discriminant Analysis (ULDA) was recently proposed for feature reduction; the features extracted via ULDA were shown to be statistically uncorrelated, which is desirable for many applications. In this paper, an algorithm called ULDA/QR is proposed to simplify the previous implementation of ULDA. Then the ULDA/GSVD algorithm is proposed, based on a novel optimization criterion, to address the singularity problem that occurs in undersampled problems, where the data dimension is larger than the data size. The criterion used is the regularized version of the one in ULDA/QR. Surprisingly, our theoretical result shows that the solution to ULDA/GSVD is independent of the value of the regularization parameter. Experimental results on various types of datasets are presented.
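The regularized criterion described above can be sketched as follows (μ > 0 the regularization parameter; the notation is assumed), with the paper's result being that the maximizer does not change with μ:

```latex
G^{*}(\mu) = \arg\max_{G}\; \operatorname{trace}\!\left( \left( G^{\top} (S_t + \mu I)\, G \right)^{-1} G^{\top} S_b\, G \right),
\qquad G^{*}(\mu)\ \text{independent of}\ \mu
```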