Optimal Brain Damage
Advances in Neural Information Processing Systems, 1990
Abstract

Cited by 420 (5 self)
We have used information-theoretic ideas to derive a class of practical and nearly optimal schemes for adapting the size of a neural network. By removing unimportant weights from a network, several improvements can be expected: better generalization, fewer training examples required, and improved speed of learning and/or classification. The basic idea is to use second-derivative information to make a tradeoff between network complexity and training set error. Experiments confirm the usefulness of the methods on a real-world application.

1 INTRODUCTION

Most successful applications of neural network learning to real-world problems have been achieved using highly structured networks of rather large size [for example (Waibel, 1989; LeCun et al., 1990)]. As applications become more complex, the networks will presumably become even larger and more structured. Design tools and techniques for comparing different architectures and minimizing the network size will be needed. More impor...
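
The abstract's idea of using second-derivative information to decide which weights to remove can be illustrated with a minimal sketch. It assumes the diagonal second-order saliency usually associated with this method, s_k = h_kk * w_k^2 / 2, which the excerpt above does not itself state. The function names, the toy least-squares problem, and all numbers are hypothetical and chosen only for illustration; this is not the authors' implementation.

```python
import numpy as np

def obd_saliencies(weights, hessian_diag):
    """Estimated rise in training error from zeroing each weight.

    Assumes a diagonal second-order approximation: s_k = h_kk * w_k**2 / 2.
    """
    return 0.5 * hessian_diag * weights ** 2

def prune_lowest(weights, saliencies, n_prune):
    """Zero the n_prune lowest-saliency weights; return (pruned, keep_mask)."""
    order = np.argsort(saliencies)          # ascending saliency
    keep = np.ones(weights.shape, dtype=bool)
    keep[order[:n_prune]] = False
    return np.where(keep, weights, 0.0), keep

# Toy setting (assumption): linear model with loss 0.5 * mean((X @ w - y)**2),
# whose Hessian is X.T @ X / n, so its diagonal is the mean of X**2 per column.
rng = np.random.default_rng(0)
X = rng.normal(size=(200, 4)) * np.array([1.0, 1.0, 0.1, 1.0])
w = np.array([2.0, 0.05, 3.0, -1.0])

h_diag = np.mean(X ** 2, axis=0)
saliency = obd_saliencies(w, h_diag)
pruned, keep = prune_lowest(w, saliency, n_prune=2)
print(saliency, pruned)
```

In this toy setup the third weight has the largest magnitude but sits on a low-curvature input, so its saliency is small and it is pruned along with the second weight. Ranking by saliency rather than by magnitude is the point of bringing in second-derivative information.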