## Improving Model Accuracy using Optimal Linear Combinations of Trained Neural Networks (1992)

Venue: IEEE Transactions on Neural Networks

Citations: 41 (3 self)

### BibTeX

@ARTICLE{Hashem92improvingmodel,
    author = {Sherif Hashem and Bruce Schmeiser},
    title = {Improving Model Accuracy using Optimal Linear Combinations of Trained Neural Networks},
    journal = {IEEE Transactions on Neural Networks},
    year = {1992},
    volume = {6},
    pages = {792--794}
}

### Abstract

Neural network (NN) based modeling often requires trying multiple networks with different architectures and training parameters in order to achieve an acceptable model accuracy. Typically, only one of the trained networks is selected as "best" and the rest are discarded. We propose using optimal linear combinations (OLCs) of the corresponding outputs of a set of NNs as an alternative to using a single network. Modeling accuracy is measured by mean squared error (MSE) with respect to the distribution of random inputs. Optimality is defined by minimizing the MSE, with the resultant combination referred to as MSE-OLC. We formulate the MSE-OLC problem for trained NNs and derive two closed-form expressions for the optimal combination-weights. An example that illustrates significant improvement in model accuracy as a result of using MSE-OLCs of the trained networks is included.

I. INTRODUCTION Constructing neural network (NN) based models often involves training a number of networks. The cr...
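The idea in the abstract can be illustrated with a minimal sketch. It assumes that the expectations in the MSE are replaced by sample averages, so that unconstrained combination-weights with a constant term reduce to an ordinary least-squares fit of the targets on the component outputs. This is not the paper's closed-form derivation, which is not reproduced in this record. The names `olc_weights`, `olc_predict`, and `demo` are hypothetical, and the "component models" are synthetic noisy approximations standing in for trained networks.

```python
import numpy as np


def olc_weights(component_outputs, targets):
    """Estimate combination-weights by least squares.

    component_outputs: array of shape (n_samples, p), one column per model.
    targets: array of shape (n_samples,).
    Returns p + 1 weights; the first one is the constant term.
    """
    n_samples = component_outputs.shape[0]
    design = np.column_stack([np.ones(n_samples), component_outputs])
    weights, *_ = np.linalg.lstsq(design, targets, rcond=None)
    return weights


def olc_predict(component_outputs, weights):
    """Combine component outputs using the estimated weights."""
    return weights[0] + component_outputs @ weights[1:]


def demo():
    """Compare the MSE of each stand-in component with the combination."""
    rng = np.random.default_rng(0)
    x = rng.uniform(0.0, 1.0, size=500)
    t = np.sin(2.0 * np.pi * x)  # stand-in target, not the paper's example

    # Synthetic components (assumption): biased, shrunken, and noisy copies.
    outputs = np.column_stack([
        t + 0.3 + 0.1 * rng.normal(size=x.size),
        0.7 * t + 0.1 * rng.normal(size=x.size),
        t + 0.2 * rng.normal(size=x.size),
    ])

    weights = olc_weights(outputs, t)
    mse_components = ((outputs - t[:, None]) ** 2).mean(axis=0)
    mse_olc = ((olc_predict(outputs, weights) - t) ** 2).mean()
    return mse_components, mse_olc
```

On the data used to estimate the weights, the combination cannot have a larger MSE than the best single component, because picking one component alone is a special case of the fit. Whether the gain carries over to new inputs has to be checked on held-out data.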

### Citations

1909 | Introduction to the Theory of Neural Computation - Hertz, Krogh, et al. - 1991 |

643 | Neural networks and the bias/variance dilemma - Geman, Bienenstock, et al. - 1992 |

Citation Context ... training a number of networks. The creation of these networks may result during the search for a mixture of network architecture and training parameters that yields an "acceptable" model performance [1, 2]. Typically, the "best" performer is selected while the rest are discarded. In other situations, a number of "small" networks may be individually trained and then combined, instead of training one "la...

543 | Neural network ensembles - Hansen, Salamon - 1990 |

Citation Context ... training a number of networks. The creation of these networks may result during the search for a mixture of network architecture and training parameters that yields an "acceptable" model performance [1, 2]. Typically, the "best" performer is selected while the rest are discarded. In other situations, a number of "small" networks may be individually trained and then combined, instead of training one "la...

306 | When Networks Disagree: Ensemble Method for Neural Networks - Perrone, Cooper - 1993 |

Citation Context ... $\left[ E\left( \delta_i(\vec{X})\, \delta_j(\vec{X}) \right) \right]$ is a $p \times p$ matrix, and $\vec{1}$ is a $p \times 1$ vector with all components equal to one. Equations 1 and 2 are derived in the appendix. The General Ensemble Method [6], developed independently, is similar to our constrained MSE-OLC. B. Estimating the MSE-OLC Weights In practice, one seldom knows the multivariate distribution $F_{\vec{X}}$. Thus, $\Phi$, $\Theta$, and $\Omega$ in...

256 | Combining forecasts: a review and annotated bibliography - Clemen - 1989 |

Citation Context ...ty, West Lafayette, IN 47907--1287. Internet: schmeise@ecn.purdue.edu II. COMBINING ESTIMATORS Linear combinations of estimators have been used by the statistics community for a long time [3]. Clemen [4] cites more than 200 studies in his review of the literature related to combining forecasts, including contributions from forecasting, psychology, statistics, and management science literatures. Avera...

132 | Optimal linear combinations of neural networks - Hashem - 1997 |

Citation Context ... Such a small difference arises because unconstrained combination-weights for accurate component networks (such as seen in this example) tend to automatically sum to one. The empirical comparisons in [9] show that the effect of constraining the weights is greater for less-accurate component networks. Thus in this example, using MSE-OLCs of the trained NNs significantly improves model accuracy compare...

108 | Improved methods of combining forecasts - Granger, Ramanathan - 1984 |

Citation Context ...tors is frequently compared to the individual estimators, and in many cases performs better [2, 3, 4]. Our approach is similar to the approaches adopted in the forecasting literature (for example see [5]), with differences primarily in the problem formulation and the definitions discussed in Section III. III. OPTIMAL LINEAR COMBINATIONS OF NEURAL NETWORKS In this section, we formulate the optimal lin...

45 | Combining forecasts - Twenty years later - Granger - 1989 |

Citation Context ...due University, West Lafayette, IN 47907--1287. Internet: schmeise@ecn.purdue.edu II. COMBINING ESTIMATORS Linear combinations of estimators have been used by the statistics community for a long time [3]. Clemen [4] cites more than 200 studies in his review of the literature related to combining forecasts, including contributions from forecasting, psychology, statistics, and management science litera...

2 | Improving the generalising capabilities of a back-propagation network - Namatame, Kimata - 1989 |

Citation Context ...D. IV. EXAMPLE AND DISCUSSION Consider the problem of approximating the function $t(X) = 0.02\,(12 + 3X - 3.5X^2 + 7.2X^3)(1 + \cos 4\pi X)(1 + 0.8 \sin 3\pi X)$ over the interval $[0, 1]$, reported in [7]. The range of $t(X)$ is $[0, 0.9)$. We train three 2-hidden-layers NNs with 5 hidden units in each hidden layer (NN1, NN2, and NN3); and three 1-hidden-layer NNs with 10 hidden units (NN4, NN5, and NN6) ...
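
The test function quoted in the last citation context can be written out as a short sketch. It assumes the garbled excerpt reads t(X) = 0.02 (12 + 3X - 3.5X^2 + 7.2X^3)(1 + cos 4πX)(1 + 0.8 sin 3πX), with the unreadable symbols taken to be π. The function names `target` and `target_range` are hypothetical. Evaluating the function on a grid over [0, 1] gives a quick check that this reading agrees with the stated range [0, 0.9).

```python
import numpy as np


def target(x):
    """Example function as read from the garbled excerpt (an assumption)."""
    x = np.asarray(x, dtype=float)
    poly = 12.0 + 3.0 * x - 3.5 * x**2 + 7.2 * x**3
    cos_term = 1.0 + np.cos(4.0 * np.pi * x)
    sin_term = 1.0 + 0.8 * np.sin(3.0 * np.pi * x)
    return 0.02 * poly * cos_term * sin_term


def target_range(num_points=10001):
    """Return the smallest and largest value of target on a grid over [0, 1]."""
    grid = np.linspace(0.0, 1.0, num_points)
    values = target(grid)
    return values.min(), values.max()
```

A grid evaluation only approximates the true extrema, so it supports the reconstruction without proving it.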