## Non-Iterative Heteroscedastic Linear Dimension Reduction for Two-Class Data: From Fisher to Chernoff (2002)

Venue: | In Proceedings of the Joint IAPR International Workshops SSPR 2002 and SPR 2002, volume LNCS 2396 |

Citations: | 6 - 3 self |

### BibTeX

@INPROCEEDINGS{Loog02non-iterativeheteroscedastic,
    author = {M. Loog and R. P. W. Duin},
    title = {Non-Iterative Heteroscedastic Linear Dimension Reduction for Two-Class Data: From Fisher to Chernoff},
    booktitle = {Proceedings of the Joint IAPR International Workshops SSPR 2002 and SPR 2002, volume LNCS 2396},
    year = {2002},
    pages = {508--517},
    publisher = {Springer}
}

### Abstract

Linear discriminant analysis (LDA) is a traditional solution to the linear dimension reduction (LDR) problem, which is based on the maximization of the between-class scatter over the within-class scatter. This solution is incapable of dealing with heteroscedastic data in a proper way, because of the implicit assumption that the covariance matrices for all the classes are equal. Hence, discriminatory information in the difference between the covariance matrices is not used and, as a consequence, we can only reduce the data to a single dimension in the two-class case.
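The single-dimension limit mentioned in the abstract can be illustrated numerically. The following is a minimal sketch, not code from the paper: the function name `fisher_two_class`, the synthetic data, and the prior-weighted pooling of the class covariances are all assumptions made for illustration. It builds the two-class between-class scatter (up to a scalar weight) from the mean difference alone and counts how many discriminative directions the Fisher criterion can yield.

```python
import numpy as np

def fisher_two_class(X1, X2):
    """Two-class Fisher LDA sketch: returns S_B, S_W and the unit Fisher direction."""
    m1, m2 = X1.mean(axis=0), X2.mean(axis=0)
    p1 = len(X1) / (len(X1) + len(X2))  # empirical prior of class 1 (assumption)
    # Pooled within-class covariance: LDA treats both classes as sharing it.
    S_W = p1 * np.cov(X1, rowvar=False) + (1 - p1) * np.cov(X2, rowvar=False)
    # Between-class scatter (up to a scalar weight): outer product of the mean difference.
    S_B = np.outer(m1 - m2, m1 - m2)
    w = np.linalg.solve(S_W, m1 - m2)  # Fisher direction S_W^{-1} (m1 - m2)
    return S_B, S_W, w / np.linalg.norm(w)

# Synthetic heteroscedastic data: class 2 has a shifted mean and different variances.
rng = np.random.default_rng(0)
X1 = rng.normal(size=(300, 4))
X2 = rng.normal(size=(300, 4)) * np.array([3.0, 1.0, 0.5, 1.0]) + np.array([1.0, 0.0, 0.0, 0.0])

S_B, S_W, w = fisher_two_class(X1, X2)
eigvals = np.abs(np.linalg.eigvals(np.linalg.solve(S_W, S_B)))
n_nonzero = int(np.sum(eigvals > 1e-8 * eigvals.max()))
print(np.linalg.matrix_rank(S_B), n_nonzero)
```

Because `S_B` is an outer product of one vector, its rank and the number of nonzero eigenvalues of `S_W^{-1} S_B` should both be 1 regardless of how much the class covariances differ, which is the limitation the abstract describes.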

### Citations

2890 |
Introduction to statistical pattern recognition
- Fukunaga
- 1990
Citation Context ...ormation matrix that maximizes the Fisher criterion is determined. This criterion gives, for a certain linear transformation, a measure of the between-class scatter over the within-class scatter (cf. [7, 9]). An attractive feature of LDA is the fast and easy way to determine this optimal linear transformation, merely requiring simple matrix arithmetics like addition, multiplication, and eigenvalue decom... |

1076 |
The use of multiple measurements in taxonomic problems
- Fisher
- 1936
Citation Context ... well-known approach to supervised linear dimension reduction (LDR), or linear feature extraction, is linear discriminant analysis (LDA). This traditional and simple technique was developed by Fisher [6] for the two-class case, and extended by Rao [16] to handle the multi-class case. In LDA, a d × n transformation matrix that maximizes the Fisher criterion is determined. This criterion gives, for a c... |

781 |
UCI repository of machine learning databases. Machine-readable data repository
- Murphy, Aha
- 1992
Citation Context ...ow vectors from U associated with the largest d singular values as the HLDR transformation. Tests were performed on three artificial [7] (cf. [12]), labelled (a) to (c), and seven real-world data sets [8, 15], labelled (d) to (j). To be able to see what discriminatory information is retained in using a HLDR, classification is done with a quadratic classifier assuming the underlying distributions to be norm... |

754 |
Linear Algebra and Its Applications
- Strang
- 1980
Citation Context ... $-\,m_2)^t (\alpha S_1 + (1-\alpha) S_2)^{-1} (m_1 - m_2) + \frac{1}{\alpha(1-\alpha)} \log \frac{|\alpha S_1 + (1-\alpha) S_2|}{|S_1|^{\alpha} |S_2|^{1-\alpha}}$. Like $\partial_E$, we can obtain $\partial_C$ as the trace of a positive semi-definite matrix $S_C$. Simple matrix manipulation [18] shows that this matrix equals (cf. [12]) $S_C := S^{-1/2} (m_1 - m_2)(m_1 - m_2)^t S^{-1/2} + \frac{1}{\alpha(1-\alpha)} (\log S - \alpha \log S_1 - (1-\alpha) \log S_2)$, where $S := \alpha S_1 + (1-\alpha) S_2$. Now, before we get to our HLDR crit... |
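The relationship stated in this fragment, that the Chernoff distance is the trace of a positive semi-definite matrix built from S1, S2 and the mean difference, can be checked numerically. The sketch below is an illustration under stated assumptions, not the authors' code: the fragment is truncated, so the leading factor of the distance is assumed to be the mean difference (m1 − m2)^t, and the helper names `spd_log`, `spd_inv_sqrt`, `chernoff_distance` and `chernoff_matrix` are hypothetical. Matrix logarithms and inverse square roots are computed by eigendecomposition, which is valid for symmetric positive definite inputs.

```python
import numpy as np

def spd_log(A):
    """Matrix logarithm of a symmetric positive definite matrix."""
    vals, vecs = np.linalg.eigh(A)
    return (vecs * np.log(vals)) @ vecs.T

def spd_inv_sqrt(A):
    """Inverse square root of a symmetric positive definite matrix."""
    vals, vecs = np.linalg.eigh(A)
    return (vecs / np.sqrt(vals)) @ vecs.T

def chernoff_distance(m1, m2, S1, S2, alpha):
    """Scalar distance; the leading (m1 - m2)^t factor is an assumption (see lead-in)."""
    S = alpha * S1 + (1 - alpha) * S2
    d = m1 - m2
    mean_term = d @ np.linalg.solve(S, d)
    logdet = lambda M: np.linalg.slogdet(M)[1]
    cov_term = (logdet(S) - alpha * logdet(S1) - (1 - alpha) * logdet(S2)) / (alpha * (1 - alpha))
    return mean_term + cov_term

def chernoff_matrix(m1, m2, S1, S2, alpha):
    """Matrix S_C as written in the fragment, with S = alpha*S1 + (1-alpha)*S2."""
    S = alpha * S1 + (1 - alpha) * S2
    d = spd_inv_sqrt(S) @ (m1 - m2)
    log_term = spd_log(S) - alpha * spd_log(S1) - (1 - alpha) * spd_log(S2)
    return np.outer(d, d) + log_term / (alpha * (1 - alpha))

# Arbitrary synthetic class parameters, for illustration only.
rng = np.random.default_rng(1)
A, B = rng.normal(size=(3, 3)), rng.normal(size=(3, 3))
S1, S2 = A @ A.T + np.eye(3), B @ B.T + np.eye(3)
m1, m2 = rng.normal(size=3), rng.normal(size=3)
alpha = 0.4

S_C = chernoff_matrix(m1, m2, S1, S2, alpha)
dist = chernoff_distance(m1, m2, S1, S2, alpha)
print(np.trace(S_C), dist, np.linalg.eigvalsh(S_C).min())
```

The two scalars agree because the trace of a matrix logarithm equals the log-determinant, and the smallest eigenvalue is non-negative up to rounding because the matrix logarithm is operator concave.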

742 |
Statistical pattern recognition: a review
- Jain, Duin, et al.
- 2000
Citation Context ...ormation matrix that maximizes the Fisher criterion is determined. This criterion gives, for a certain linear transformation, a measure of the between-class scatter over the within-class scatter (cf. [7, 9]). An attractive feature of LDA is the fast and easy way to determine this optimal linear transformation, merely requiring simple matrix arithmetics like addition, multiplication, and eigenvalue decom... |

423 |
Discriminant analysis and statistical pattern recognition. Wiley-Interscience
- McLachlan
- 1992
Citation Context ...maller than n and not only to a single dimension. We call our HLDR criterion the Chernoff criterion. Several alternative approaches to HLDR are known, of which we mention the following ones. See also [14]. In the two-class case, under the assumptions that both classes are normally distributed and that one wants a reduction to one dimension, Kazakos [10] reduces the LDR problem to a one-dimensional sea... |

306 |
Mathematical Statistics and Data Analysis
- Rice
- 1989
Citation Context ...data sets, we restricted ourselves to discussing the main results, and to the most interesting observations. The p-values stated in this part are obtained by comparing the data via a signed rank test [17]. 3.1 Fukunaga’s Heteroscedastic Two-Class Data and Two Variations. Fukunaga [7] describes a heteroscedastic model consisting of two classes in eight dimensions. The classes are normally distributed wi... |

61 |
The utilization of multiple measurements in problems of biological classification
- Rao
- 1948
Citation Context ...ion reduction (LDR), or linear feature extraction, is linear discriminant analysis (LDA). This traditional and simple technique was developed by Fisher [6] for the two-class case, and extended by Rao [16] to handle the multi-class case. In LDA, a d × n transformation matrix that maximizes the Fisher criterion is determined. This criterion gives, for a certain linear transformation, a measure of the be... |

53 |
Linear algebra and its applications. Harcourt Brace Jovanovich Inc
- Strang
- 1988
Citation Context ... $-\,m_2)^t (\alpha S_1 + (1-\alpha) S_2)^{-1} (m_1 - m_2) + \frac{1}{\alpha(1-\alpha)} \log \frac{|\alpha S_1 + (1-\alpha) S_2|}{|S_1|^{\alpha} |S_2|^{1-\alpha}}$. Like $\partial_E$, we can obtain $\partial_C$ as the trace of a positive semi-definite matrix $S_C$. Simple matrix manipulation [18] shows that this matrix equals (cf. [12]) $S_C := S^{-1/2} (m_1 - m_2)(m_1 - m_2)^t S^{-1/2} + \frac{1}{\alpha(1-\alpha)} (\log S - \alpha \log S_1 - (1-\alpha) \log S_2)$, where $S := \alpha S_1 + (1-\alpha) S_2$. Now, before we get to our HLDR crit... |

23 |
Classification into two multivariate normal distributions with different covariance matrices. The Annals of Mathematical Statistics 33(2): 420–431
- Anderson, Bahadur
- 1962
Citation Context ...e LDR problem to a one-dimensional search problem. Finding the optimal solution for this search problem, is equivalent to finding the optimal linear feature. The work of Kazakos is closely related to [1]. Other HLDR approaches for two-class problems were proposed by Malina [13], and Decell et al. [4, 5], of which the latter is also applicable in the multi-class case. These three approaches are also h... |

20 |
A generalization of linear discriminant analysis in maximum likelihood framework
- Kumar, Andreou
- 1996
Citation Context ...hat the maximization of them needs complex or iterative optimization procedures. Another iterative multi-class HLDR procedure, which is based on a maximum likelihood formulation of LDA, is studied in [11]. Here LDA is generalized by dropping the assumption that all classes have equal within-class covariance matrices and maximizing the likelihood for this model. A fast HLDR method based on a singular v... |

15 |
Approximate Pairwise Accuracy Criteria for Multiclass Linear Dimension Reduction: Generalizations of the Fisher Criterion. Delft Univ
- Loog
- 1999
Citation Context ...cf. [7]). Our generalization takes into account the discriminatory information that is present in the difference of the covariance matrices. This is done by means of directed distance matrices (DDMs) [12], which are generalizations of the between-class covariance matrix. This between-class covariance matrix, as used in LDA, merely takes into account the discriminatory information that is present in th... |

13 |
On information and distance measures, error bounds, and feature selection. The information scientist
- Chen
- 1979
Citation Context ...eans and can be associated with the Euclidean distance. The specific heteroscedastic generalization of the Fisher criterion, that we study more closely in Section 2, is based on the Chernoff distance [2, 3]. This measure of affinity of two densities considers mean differences as well as covariance differences (as opposed to the Euclidean distance) and can be used to extend LDA in such a way that we retain... |

9 |
Feature Combinations and the Divergence Criterion
- Decell, Mayekar
- 1977
Citation Context ...oblem, is equivalent to finding the optimal linear feature. The work of Kazakos is closely related to [1]. Other HLDR approaches for two-class problems were proposed by Malina [13], and Decell et al. [4, 5], of which the latter is also applicable in the multi-class case. These three approaches are also heteroscedastic generalizations of the Fisher criterion. [13] uses scatter measures different to the o... |

7 |
Measures of distance between probability distributions
- Chung, Kannappan, et al.
- 1989
Citation Context ...eans and can be associated with the Euclidean distance. The specific heteroscedastic generalization of the Fisher criterion, that we study more closely in Section 2, is based on the Chernoff distance [2, 3]. This measure of affinity of two densities considers mean differences as well as covariance differences (as opposed to the Euclidean distance) and can be used to extend LDA in such a way that we retain... |

7 |
Automatic segmentation of lung fields in chest radiographs
- van Ginneken, ter Haar Romeny
Citation Context ...ow vectors from U associated with the largest d singular values as the HLDR transformation. Tests were performed on three artificial [7] (cf. [12]), labelled (a) to (c), and seven real-world data sets [8, 15], labelled (d) to (j). To be able to see what discriminatory information is retained in using a HLDR, classification is done with a quadratic classifier assuming the underlying distributions to be norm... |

7 |
On an extended Fisher criterion for feature selection
- Malina
- 1981

4 |
Linear dimension reduction and Bayes classification
- Tubbs, Coberly, et al.
- 1982
Citation Context ...assumption that all classes have equal within-class covariance matrices and maximizing the likelihood for this model. A fast HLDR method based on a singular value decomposition (svd) was developed in [19] by Tubbs et al. We discuss this method in more detail in Section 3, where we also compare our non-iterative method to theirs. The comparison is done on three artificial data sets and seven real-world... |

2 |
On the optimal linear feature
- Kazakos
- 1978
Citation Context ...of which we mention the following ones. See also [14]. In the two-class case, under the assumptions that both classes are normally distributed and that one wants a reduction to one dimension, Kazakos [10] reduces the LDR problem to a one-dimensional search problem. Finding the optimal solution for this search problem, is equivalent to finding the optimal linear feature. The work of Kazakos is closely ... |

1 |
Feature combinations and the Bhattacharyya criterion
- Decell, Marani
- 1976
Citation Context ...oblem, is equivalent to finding the optimal linear feature. The work of Kazakos is closely related to [1]. Other HLDR approaches for two-class problems were proposed by Malina [13], and Decell et al. [4, 5], of which the latter is also applicable in the multi-class case. These three approaches are also heteroscedastic generalizations of the Fisher criterion. [13] uses scatter measures different to the o... |