## Instance pruning techniques (1997)

Venue: MACHINE LEARNING: PROCEEDINGS OF THE FOURTEENTH INTERNATIONAL CONFERENCE (ICML’97)

### BibTeX

@INPROCEEDINGS{Wilson97instancepruning,

author = {D. Randall Wilson and Tony R. Martinez},

title = {Instance pruning techniques},

booktitle = {MACHINE LEARNING: PROCEEDINGS OF THE FOURTEENTH INTERNATIONAL CONFERENCE (ICML’97)},

year = {1997},

pages = {404--411},

publisher = {Morgan Kaufmann}

}

### Abstract

The nearest neighbor algorithm and its derivatives are often quite successful at learning a concept from a training set and providing good generalization on subsequent input vectors. However, these techniques often retain the entire training set in memory, resulting in large memory requirements and slow execution speed, as well as a sensitivity to noise. This paper provides a discussion of issues related to reducing the number of instances retained in memory while maintaining (and sometimes improving) generalization accuracy, and mentions algorithms other researchers have used to address this problem. It presents three intuitive noise-tolerant algorithms that can be used to prune instances from the training set. In experiments on 29 applications, the algorithm that achieves the highest reduction in storage also results in the highest generalization accuracy of the three methods.

