## Mathematical Programming in Data Mining (1996)

Venue: Data Mining and Knowledge Discovery

Citations: 26 (3 self)

### BibTeX

@ARTICLE{Mangasarian96mathematicalprogramming,
  author = {O. L. Mangasarian},
  title = {Mathematical Programming in Data Mining},
  journal = {Data Mining and Knowledge Discovery},
  year = {1996},
  volume = {42},
  pages = {183--201}
}

### Abstract

Mathematical programming approaches to three fundamental problems will be described: feature selection, clustering and robust representation. The feature selection problem considered is that of discriminating between two sets while recognizing irrelevant and redundant features and suppressing them. This creates a lean model that often generalizes better to new unseen data. Computational results on real data confirm improved generalization of leaner models. Clustering is exemplified by the unsupervised learning of patterns and clusters that may exist in a given database and is a useful tool for knowledge discovery in databases (KDD). A mathematical programming formulation of this problem is proposed that is theoretically justifiable and computationally implementable in a finite number of steps. A resulting k-Median Algorithm is utilized to discover very useful survival curves for breast cancer patients from a medical database. Robust representation is concerned with minimizing trained m...
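The abstract names a k-Median Algorithm without giving its steps. As a rough illustration only, the sketch below shows one common way such a scheme is written: alternate between assigning each point to its nearest center in the 1-norm and moving each center to the coordinate-wise median of its assigned points. The function name `k_median`, its parameters, the empty-cluster rule, and the toy data are assumptions made for this example; they are not taken from the paper.

```python
from statistics import median


def k_median(points, centers, max_iter=100):
    """Hypothetical sketch: alternate 1-norm assignment and median updates."""
    centers = [list(c) for c in centers]
    labels = None
    for _ in range(max_iter):
        # Assignment step: each point goes to the nearest center in the 1-norm.
        new_labels = [
            min(range(len(centers)),
                key=lambda j: sum(abs(a - b) for a, b in zip(p, centers[j])))
            for p in points
        ]
        if new_labels == labels:
            break  # assignments are stable, so the centers would not move again
        labels = new_labels
        # Update step: the coordinate-wise median minimizes the summed
        # 1-norm distance to the points of a cluster.
        for j in range(len(centers)):
            members = [p for p, lab in zip(points, labels) if lab == j]
            if members:  # assumption: keep the old center if a cluster is empty
                centers[j] = [median(col) for col in zip(*members)]
    return centers, labels


# Toy data (made up): two well-separated groups of three points each.
toy_points = [(0, 0), (1, 0), (0, 1), (10, 10), (11, 10), (10, 11)]
toy_centers, toy_labels = k_median(toy_points, [(0, 0), (10, 10)])
```

On this toy input the loop stops on the second pass, once the assignments repeat. The median update is what distinguishes this sketch from the more familiar k-means scheme, which would use the mean and squared 2-norm distance instead.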