Results 1 - 2 of 2
Efficient and Robust Feature Extraction by Maximum Margin Criterion
- In Advances in Neural Information Processing Systems 16, 2003
Cited by 23 (3 self)
In pattern recognition, feature extraction techniques are widely employed to reduce the dimensionality of data and to enhance the discriminatory information. Principal Component Analysis (PCA) and Linear Discriminant Analysis (LDA) are the two most popular linear dimensionality reduction methods. However, PCA is not very effective for the extraction of the most discriminant features, and LDA is not stable due to the small sample size problem. In this paper, we propose some new (linear and nonlinear) feature extractors based on the maximum margin criterion (MMC). Geometrically, feature extractors based on MMC maximize the (average) margin between classes after dimensionality reduction. It is shown that MMC can represent class separability better than PCA. As a connection to LDA, we may also derive LDA from MMC by incorporating some constraints. By using some other constraints, we establish a new linear feature extractor that does not suffer from the small sample size problem, which is known to cause serious stability problems for LDA. The kernelized (nonlinear) counterpart of this linear feature extractor is also established in the paper. Our extensive experiments demonstrate that the new feature extractors are effective, stable, and efficient.
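The linear extractor described in this abstract can be illustrated with a minimal sketch. It assumes the criterion is written as tr(W^T (S_b - S_w) W) with orthonormal columns in W, so that W is formed from the leading eigenvectors of S_b - S_w; the abstract does not spell out this form, so treat it as an assumption. The function names `scatter_matrices` and `mmc_projection` and the synthetic data are hypothetical and are not taken from the paper.

```python
import numpy as np

def scatter_matrices(X, y):
    """Return (S_b, S_w), each weighted by class priors estimated from y."""
    X = np.asarray(X, dtype=float)
    y = np.asarray(y)
    n, d = X.shape
    mean_all = X.mean(axis=0)
    S_b = np.zeros((d, d))
    S_w = np.zeros((d, d))
    for label in np.unique(y):
        Xc = X[y == label]
        prior = Xc.shape[0] / n
        # Between-class part: spread of the class mean around the global mean.
        diff = (Xc.mean(axis=0) - mean_all)[:, None]
        S_b += prior * (diff @ diff.T)
        # Within-class part: covariance of the class around its own mean.
        centered = Xc - Xc.mean(axis=0)
        S_w += prior * (centered.T @ centered) / Xc.shape[0]
    return S_b, S_w

def mmc_projection(X, y, n_components):
    """Assumed MMC solution: top eigenvectors of the symmetric matrix S_b - S_w."""
    S_b, S_w = scatter_matrices(X, y)
    # eigh handles symmetric matrices and returns eigenvalues in ascending order.
    eigvals, eigvecs = np.linalg.eigh(S_b - S_w)
    order = np.argsort(eigvals)[::-1][:n_components]
    return eigvecs[:, order], eigvals[order]

# Synthetic small-sample setting: 10 samples, 50 features, 2 classes.
rng = np.random.default_rng(0)
X = np.vstack([
    rng.normal(0.0, 1.0, size=(5, 50)),
    rng.normal(3.0, 1.0, size=(5, 50)),
])
y = np.array([0] * 5 + [1] * 5)

W, top_eigvals = mmc_projection(X, y, n_components=1)
Z = X @ W  # one-dimensional features
```

Note that no matrix inverse appears anywhere in this sketch, which is why a singular S_w does not prevent the computation in the small-sample setting.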
Robust and accurate cancer classification with gene expression profiling
- In Proc. 4th IEEE Comput. Syst. Bioinf. Conf., 2005
Cited by 3 (0 self)
Robust and accurate cancer classification is critical in cancer treatment. Gene expression profiling is expected to enable us to diagnose tumors precisely and systematically. However, the classification task in this context is very challenging because of the curse of dimensionality and the small sample size problem. In this paper, we propose a novel method to solve these two problems. Our method is able to map gene expression data into a very low dimensional space and thus meets the recommended samples-to-features-per-class ratio. As a result, it can be used to classify new samples robustly with low and trustworthy (estimated) error rates. The method is based on linear discriminant analysis (LDA). However, conventional LDA requires that the within-class scatter matrix Sw be nonsingular. Unfortunately, Sw is always singular in the case of cancer classification due to the small sample size problem. To overcome this problem, we develop a generalized linear discriminant analysis (GLDA) that is a general, direct, and complete solution to optimize Fisher’s criterion. GLDA is mathematically well-founded and coincides with conventional LDA when Sw is nonsingular. Unlike conventional LDA, GLDA does not assume the nonsingularity of Sw, and thus naturally solves the small sample size problem. To accommodate the high dimensionality of scatter matrices, a fast algorithm for GLDA is also developed. Our extensive experiments on seven public cancer datasets show that the method performs well. In particular, on some difficult instances that have very small samples-to-genes-per-class ratios, our method achieves much higher accuracy than widely used classification methods such as support vector machines and random forests.
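The singularity of Sw mentioned in this abstract can be checked numerically. The sketch below is only an illustration of the small sample size problem, not an implementation of GLDA, whose algorithm the abstract does not describe. The helper name `within_class_scatter`, the dataset sizes, and the random data standing in for gene expression values are all assumptions made for the example.

```python
import numpy as np

def within_class_scatter(X, y):
    """Sum of per-class scatter matrices (unnormalized Sw)."""
    X = np.asarray(X, dtype=float)
    y = np.asarray(y)
    d = X.shape[1]
    S_w = np.zeros((d, d))
    for label in np.unique(y):
        Xc = X[y == label]
        centered = Xc - Xc.mean(axis=0)
        S_w += centered.T @ centered
    return S_w

# Hypothetical sizes: far fewer samples than features, as in expression data.
n_samples, n_genes, n_classes = 12, 200, 3
rng = np.random.default_rng(0)
X = rng.normal(size=(n_samples, n_genes))
y = np.repeat(np.arange(n_classes), n_samples // n_classes)

S_w = within_class_scatter(X, y)
rank = np.linalg.matrix_rank(S_w)

# Centering each class removes one degree of freedom per class, so the rank
# of Sw cannot exceed n_samples - n_classes, far below n_genes.
is_singular = rank < n_genes
```

Because the rank is bounded by the number of samples minus the number of classes, Sw cannot be inverted whenever the number of features exceeds that bound, which is the situation the abstract says conventional LDA cannot handle.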

