## Instance pruning techniques (1997)

Venue: Machine Learning: Proceedings of the Fourteenth International Conference (ICML'97)

Citations: 75 (10 self)

### BibTeX

@INPROCEEDINGS{Wilson97instancepruning,
  author    = {D. Randall Wilson and Tony R. Martinez},
  title     = {Instance pruning techniques},
  booktitle = {Machine Learning: Proceedings of the Fourteenth International Conference (ICML'97)},
  year      = {1997},
  pages     = {404--411},
  publisher = {Morgan Kaufmann}
}


### Abstract

The nearest neighbor algorithm and its derivatives are often quite successful at learning a concept from a training set and providing good generalization on subsequent input vectors. However, these techniques often retain the entire training set in memory, resulting in large memory requirements and slow execution speed, as well as a sensitivity to noise. This paper provides a discussion of issues related to reducing the number of instances retained in memory while maintaining (and sometimes improving) generalization accuracy, and mentions algorithms other researchers have used to address this problem. It presents three intuitive noise-tolerant algorithms that can be used to prune instances from the training set. In experiments on 29 applications, the algorithm that achieves the highest reduction in storage also results in the highest generalization accuracy of the three methods.

### Citations

3085 | UCI repository of machine learning databases
- Blake, Merz
- 1998
Citation Context: ...sing k = 3, and using the HVDM distance function described in Section 3.4. These algorithms were tested on 29 data sets from the University of California, Irvine Machine Learning Database Repository (Merz & Murphy, 1996) and compared to a k-nearest neighbor classifier that was identical to RT1 except that it does not remove any instances from the instance set (i.e., S=T). Each test consisted of ten trials, each usin...

1144 | Instance-based learning algorithms
- Aha, Kibler, et al.
- 1991
Citation Context: ... series of instance-based learning algorithms that reduce storage. IB2 is quite similar to the Condensed Nearest Neighbor (CNN) rule (Hart, 1968), and suffers from the same sensitivity to noise. IB3 (Aha et al. 1991) addresses IB2's problem of keeping noisy instances by using a statistical test to retain only acceptable misclassified instances. In their experiments, IB3 was able to achieve greater reduction in t...

969 | Nearest neighbor pattern classification
- Cover, Hart
- 1967
Citation Context: ...29 applications, the algorithm that achieves the highest reduction in storage also results in the highest generalization accuracy of the three methods. 1. INTRODUCTION The nearest neighbor algorithm (Cover & Hart, 1967; Dasarathy, 1991) has been used successfully for pattern classification on many applications. Each pattern has an input vector with one value for each of several input attributes. An instance has an ...

516 | Toward Memory-based Reasoning
- Stanfill, Waltz
- 1986
Citation Context: ...hen nominal (discrete, unordered) attributes are included in an application, a distance metric is needed that can handle them. We use a distance function based upon the Value Difference Metric (VDM) (Stanfill & Waltz, 1986) for nominal attributes. A simplified version of the VDM defines the distance between two values x and y of a single attribute a as: $vdm_a(x, y) = \sum_{c=1}^{C} \left| \frac{N_{a,x,c}}{N_{a,x}} - \frac{N_{a,y,c}}{N_{a,y}} \right|$ ...

268 | The Condensed Nearest Neighbor Rule
- Hart
- 1968
Citation Context: ...INSTANCE-BASED" LEARNING ALGORITHMS Aha et al. (1991) presented a series of instance-based learning algorithms that reduce storage. IB2 is quite similar to the Condensed Nearest Neighbor (CNN) rule (Hart, 1968), and suffers from the same sensitivity to noise. IB3 (Aha et al. 1991) addresses IB2's problem of keeping noisy instances by using a statistical test to retain only acceptable misclassified instance...
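The CNN rule referenced in the context above can be sketched as follows. This is an illustrative reading of Hart's procedure over numeric feature tuples, not code from the paper; the function name `cnn` and the seeding choice are assumptions.

```python
def cnn(train, labels):
    """Condensed Nearest Neighbor sketch: grow a subset S of training
    indices until 1-NN over S classifies every training instance correctly.
    train: list of numeric feature tuples; labels: parallel class labels.
    """
    def nn_label(S, x):
        # Label of x's nearest neighbor within the current subset S
        best = min(S, key=lambda i: sum((a - b) ** 2 for a, b in zip(train[i], x)))
        return labels[best]

    S = [0]                     # seed with the first instance (arbitrary choice)
    changed = True
    while changed:              # repeat until a full pass adds nothing
        changed = False
        for i in range(len(train)):
            if i not in S and nn_label(S, train[i]) != labels[i]:
                S.append(i)     # absorb any instance the subset misclassifies
                changed = True
    return S
```

Because only misclassified instances are absorbed, noisy points near class boundaries tend to be kept, which is the noise sensitivity the context attributes to CNN and IB2.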

233 | Improved Heterogeneous Distance Functions
- Wilson, Martinez
- 1997
Citation Context: ...assifications, regardless of the order of the values. In order to handle heterogeneous applications---those with both numeric and nominal attributes---we use the heterogeneous distance function HVDM (Wilson & Martinez, 1997), which is defined as: $HVDM(x, y) = \sqrt{\sum_{a=1}^{m} d_a^2(x_a, y_a)}$ (3), where the function $d_a(x, y)$ is the distance for attribute a and is defined as $vdm_a(x, y)$ if a is nominal, and $|x - y| / (4\sigma_a)$ ...
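Combining the two branches of the reconstructed definition, HVDM can be sketched as below. This is an illustrative sketch, not the authors' implementation; the argument names and the requirement that `stds` (per-attribute standard deviations) be precomputed and positive are assumptions for this example.

```python
import math

def hvdm(x, y, train, labels, nominal, stds):
    """HVDM sketch: VDM for nominal attributes, |x - y| / (4 * sigma_a)
    for numeric ones, combined as a Euclidean norm over attributes.

    nominal: set of attribute indices treated as nominal
    stds: per-attribute standard deviations (precomputed, assumed > 0)
    """
    def vdm_a(a, u, v):
        # Simplified per-attribute VDM from class-conditional value counts
        col = [row[a] for row in train]
        n_u = sum(1 for w in col if w == u) or 1
        n_v = sum(1 for w in col if w == v) or 1
        return sum(
            abs(sum(1 for w, l in zip(col, labels) if w == u and l == c) / n_u
                - sum(1 for w, l in zip(col, labels) if w == v and l == c) / n_v)
            for c in set(labels))

    total = 0.0
    for a in range(len(x)):
        if a in nominal:
            d = vdm_a(a, x[a], y[a])
        else:
            d = abs(x[a] - y[a]) / (4 * stds[a])  # numeric: 4-sigma normalization
        total += d * d
    return math.sqrt(total)
```

The 4-sigma divisor keeps numeric differences on a scale comparable to the bounded nominal distances, so neither attribute type dominates the sum.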

204 | Asymptotic properties of nearest neighbor rules using edited data
- Wilson
- 1972
Citation Context: ...tain a subset of the original instances, including the Condensed NN rule (CNN) (Hart, 1968), the Reduced NN rule (RNN) (Gates 1972), the Selective NN rule (SNN) (Ritter et al., 1975), Wilson's rule (Wilson, 1972), the "all k-NN" method (Tomek, 1976), Instance-Based (IBL) Algorithms (Aha et al. 1991), and the Typical Instance Based Learning (TIBL) algorithm (Zhang, 1992). Another decision that affects the co...
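Wilson's editing rule, named in the survey above, can be sketched as follows: discard any instance whose k nearest neighbors vote for a different class. This is an illustrative version over numeric tuples with Euclidean distance; the function name and k=3 default are assumptions.

```python
from collections import Counter

def wilson_edit(train, labels, k=3):
    """Wilson-style editing sketch: keep only instances that agree with
    the majority class of their k nearest neighbors in the full set."""
    keep = []
    for i, x in enumerate(train):
        # k nearest neighbors of x, excluding x itself
        others = sorted((j for j in range(len(train)) if j != i),
                        key=lambda j: sum((a - b) ** 2 for a, b in zip(train[j], x)))
        votes = Counter(labels[j] for j in others[:k])
        if votes.most_common(1)[0][0] == labels[i]:
            keep.append(i)          # neighbors agree with its label: keep it
    return keep
```

Unlike CNN, which grows a subset from the boundary, this rule removes isolated mislabeled points, which is why it serves as the noise-reduction step mentioned in the Gates context above.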

178 | A Nearest Hyperrectangle Learning Method
- Salzberg
- 1991
Citation Context: ...oice in designing a training set reduction algorithm is to decide whether to retain a subset of the original instances or whether to modify the instances using a new representation. For example, NGE (Salzberg, 1991) and its derivatives (Wettschereck & Dietterich, 1995) use hyperrectangles to represent collections of instances; RISE (Domingos, 1995) generalizes instances into rules; and prototypes (Chang 1974) c...

132 | The reduced nearest neighbor rule
- Gates
- 1972
Citation Context: ... hypothesized to provide substantial instance reduction while continuing to generalize accurately, even in the presence of noise. The first is similar to the Reduced Nearest Neighbor (RNN) algorithm (Gates 1972). The second changes the order in which instances are considered for removal, and the third adds a noise-reduction step similar to that done by Wilson (1972) before proceeding with the main reducti...

99 | An experimental comparison of the nearest-neighbor and nearest-hyperrectangle algorithms
- Wettschereck, Dietterich
- 1995
Citation Context: ...ction algorithm is to decide whether to retain a subset of the original instances or whether to modify the instances using a new representation. For example, NGE (Salzberg, 1991) and its derivatives (Wettschereck & Dietterich, 1995) use hyperrectangles to represent collections of instances; RISE (Domingos, 1995) generalizes instances into rules; and prototypes (Chang 1974) can be used to represent a cluster of instances, even i...

83 | Finding prototypes for nearest neighbor classifiers
- Chang
- 1974
Citation Context: ...alzberg, 1991) and its derivatives (Wettschereck & Dietterich, 1995) use hyperrectangles to represent collections of instances; RISE (Domingos, 1995) generalizes instances into rules; and prototypes (Chang 1974) can be used to represent a cluster of instances, even if no original instance occurred at the point where the prototype is located. On the other hand, many models seek to retain a subset of the orig...

70 | Nearest Neighbor (NN) Norms
- Dasarathy
- 1991
Citation Context: ... algorithm that achieves the highest reduction in storage also results in the highest generalization accuracy of the three methods. 1. INTRODUCTION The nearest neighbor algorithm (Cover & Hart, 1967; Dasarathy, 1991) has been used successfully for pattern classification on many applications. Each pattern has an input vector with one value for each of several input attributes. An instance has an input vector and ...

70 | An Experiment with the Edited Nearest-Neighbor Rule
- Tomek
- 1976
Citation Context: ...s, including the Condensed NN rule (CNN) (Hart, 1968), the Reduced NN rule (RNN) (Gates 1972), the Selective NN rule (SNN) (Ritter et al., 1975), Wilson's rule (Wilson, 1972), the "all k-NN" method (Tomek, 1976), Instance-Based (IBL) Algorithms (Aha et al. 1991), and the Typical Instance Based Learning (TIBL) algorithm (Zhang, 1992). Another decision that affects the concept description for many algorithms...

64 | Rule Induction and Instance-Based Learning: A Unified Approach
- Domingos
- 1995
Citation Context: ...dify the instances using a new representation. For example, NGE (Salzberg, 1991) and its derivatives (Wettschereck & Dietterich, 1995) use hyperrectangles to represent collections of instances; RISE (Domingos, 1995) generalizes instances into rules; and prototypes (Chang 1974) can be used to represent a cluster of instances, even if no original instance occurred at the point where the prototype is located. On t...

38 | A Hybrid Nearest-Neighbor and Nearest-Hyperrectangle Algorithm - Wettschereck |

29 | Instance Selection by Encoding Length Heuristic with Random Mutation Hill Climbing
- Cameron-Jones
- 1995
Citation Context: ... in Section 2, including CNN (Hart, 1968), SNN (Ritter et al., 1975), Wilson's Rule (Wilson, 1972), the "All k-NN" method (Tomek, 1976), IB2, IB3 (Aha, Kibler & Albert, 1991), and the Explore method (Cameron-Jones, 1995). Space does not permit the inclusion of all of the results, but the results for IB3 are included in Table 1 as a benchmark for comparison, since it is one of the most popular algorithms. This versio...

18 | A Worst-Case Analysis of Nearest Neighbor Searching by Projection
- Papadimitriou, Bentley
- 1980
Citation Context: ...input vector or output class, or those not representative of typical cases) are stored as well, which can degrade generalization accuracy. Techniques such as k-d trees (Sproull, 1991) and projection (Papadimitriou & Bentley, 1980) can reduce the time required to find the nearest neighbor(s) of an input vector, but they do not reduce storage requirements, do not address the problem of noise, and often become much less effectiv...

7 | Refinements to nearest-neighbor searching in k-dimensional trees
- Sproull
- 1991
Citation Context: ...i.e., those with errors in the input vector or output class, or those not representative of typical cases) are stored as well, which can degrade generalization accuracy. Techniques such as k-d trees (Sproull, 1991) and projection (Papadimitriou & Bentley, 1980) can reduce the time required to find the nearest neighbor(s) of an input vector, but they do not reduce storage requirements, do not address the proble...

2 | An Algorithm for a Selective Nearest Neighbor Decision Rule - Isenhour - 1975 |