Flat Minimum Search Finds Simple Nets (1994)
BibTeX
@TECHREPORT{Hochreiter94flatminimum,
author = {Sepp Hochreiter and Jürgen Schmidhuber},
title = {Flat Minimum Search Finds Simple Nets},
institution = {},
year = {1994}
}
Abstract
We present a new algorithm for finding low-complexity neural networks with high generalization capability. The algorithm searches for a "flat" minimum of the error function. A flat minimum is a large connected region in weight space where the error remains approximately constant. An MDL-based argument shows that flat minima correspond to low expected overfitting. Although our algorithm requires the computation of second-order derivatives, it has backprop's order of complexity. In effect, it automatically prunes units, weights, and input lines. Various experiments with feedforward and recurrent nets are described. In an application to stock market prediction, flat minimum search outperforms (1) conventional backprop, (2) weight decay, and (3) "optimal brain surgeon" / "optimal brain damage".
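
The notion of a flat minimum described in the abstract can be illustrated with a small sketch. It is not the flat minimum search algorithm itself, which the abstract says relies on second-order derivatives. Instead, it assumes a much cruder sampling-based proxy: perturb the weights at random inside a small box and measure how much the error rises on average. The function name `flatness_proxy`, the two toy error surfaces, and the parameters `radius` and `n_samples` are all hypothetical and chosen only for illustration.

```python
import random


def flatness_proxy(error_fn, weights, radius=0.1, n_samples=200, seed=0):
    """Mean increase in error under random perturbations of `weights`.

    Each weight is shifted by a value drawn uniformly from [-radius, radius].
    Smaller return values indicate a flatter neighbourhood.
    Hypothetical illustration only; not the algorithm from the report.
    """
    rng = random.Random(seed)
    base = error_fn(weights)
    total = 0.0
    for _ in range(n_samples):
        perturbed = [w + rng.uniform(-radius, radius) for w in weights]
        total += error_fn(perturbed) - base
    return total / n_samples


# Two toy error surfaces (assumptions) sharing the same minimum:
# error 0 at the origin, but very different curvature around it.
def sharp_error(w):
    return sum(100.0 * x * x for x in w)


def flat_error(w):
    return sum(0.01 * x * x for x in w)


minimum = [0.0, 0.0]
sharp_score = flatness_proxy(sharp_error, minimum)
flat_score = flatness_proxy(flat_error, minimum)
print(f"sharp minimum: {sharp_score:.6f}")
print(f"flat minimum:  {flat_score:.6f}")
```

Both surfaces reach the same error at the origin, yet the proxy is far smaller for `flat_error`: its weights can be specified with less precision without raising the error, which is the intuition behind the MDL-based argument mentioned in the abstract.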