Flat Minima, 1997
- Cited by 32 (13 self)
... this paper (available on the World-Wide Web; see our home pages) contains pseudo-code of an efficient implementation. It is based on fast multiplication of the Hessian and a vector due to Pearlmutter (1994) and Møller (1993).
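To illustrate what a Hessian-vector product computes, here is a minimal sketch in plain Python. It is not the exact procedure of Pearlmutter (1994) or Møller (1993), and it is not the pseudo-code the paper refers to: it swaps in a central finite difference of the gradient, which approximates the same quantity H v without ever forming the Hessian H. The quadratic error function and the names `grad_quadratic` and `hessian_vector_product` are assumptions made for this example only.

```python
def grad_quadratic(A, w):
    """Gradient of the toy error E(w) = 0.5 * w^T A w for symmetric A, i.e. A w."""
    return [sum(a_ij * w_j for a_ij, w_j in zip(row, w)) for row in A]


def hessian_vector_product(grad_fn, w, v, eps=1e-4):
    """Approximate H v by a central difference of the gradient along v.

    Assumption: grad_fn returns the gradient of the error at a weight vector.
    Only two gradient evaluations are needed; H itself is never built.
    """
    w_plus = [wi + eps * vi for wi, vi in zip(w, v)]
    w_minus = [wi - eps * vi for wi, vi in zip(w, v)]
    g_plus = grad_fn(w_plus)
    g_minus = grad_fn(w_minus)
    return [(gp - gm) / (2.0 * eps) for gp, gm in zip(g_plus, g_minus)]


# Toy problem: the Hessian of E is A, so H v should be close to A v = [4, 7].
A = [[2.0, 1.0], [1.0, 3.0]]
w = [0.5, -1.0]
v = [1.0, 2.0]
hv = hessian_vector_product(lambda x: grad_quadratic(A, x), w, v)
print(hv)
```

The cost is two gradient evaluations per product, which is why methods of this family keep the overall cost on the order of backpropagation. The exact methods cited in the entry avoid the step size `eps` and its rounding error altogether.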
Simplifying Neural Nets by Discovering Flat Minima
- Advances in Neural Information Processing Systems 7, 1995
- Cited by 1 (1 self)
We present a new algorithm for finding low complexity networks with high generalization capability. The algorithm searches for large connected regions of so-called "flat" minima of the error function. In the weight-space environment of a "flat" minimum, the error remains approximately constant. Using an MDL-based argument, flat minima can be shown to correspond to low expected overfitting. Although our algorithm requires the computation of second order derivatives, it has backprop's order of complexity. Experiments with feedforward and recurrent nets are described.
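The notion that the error "remains approximately constant" around a flat minimum can be made concrete with a toy measurement. The sketch below is not the algorithm from the paper: it only compares how much a one-dimensional error function changes inside a small interval around a minimum. The two error functions, the interval width, and the name `max_error_change` are all assumptions chosen for illustration.

```python
def max_error_change(error_fn, w, delta, steps=21):
    """Largest |E(w + d) - E(w)| over an evenly spaced grid of d in [-delta, delta]."""
    base = error_fn(w)
    worst = 0.0
    for k in range(steps):
        d = -delta + 2.0 * delta * k / (steps - 1)
        worst = max(worst, abs(error_fn(w + d) - base))
    return worst


def sharp_error(w):
    """Hypothetical error with a sharp minimum at w = 0 (high curvature)."""
    return 100.0 * w * w


def flat_error(w):
    """Hypothetical error with a flat minimum at w = 0 (low curvature)."""
    return 0.01 * w * w


# Both functions have their minimum at w = 0; only the curvature differs.
sharp_change = max_error_change(sharp_error, 0.0, 0.1)
flat_change = max_error_change(flat_error, 0.0, 0.1)
print(sharp_change, flat_change)
```

In the flat case the weight can move anywhere in the interval with almost no change in error, so it needs to be specified less precisely. This is the intuition behind the MDL-based argument mentioned in the abstract.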

