Active Learning with Statistical Models (1996) [301 citations — 6 self]

Abstract:

For many types of machine learning algorithms, one can compute the statistically "optimal" way to select training data. In this paper, we review how optimal data selection techniques have been used with feedforward neural networks. We then show how the same principles may be used to select data for two alternative, statistically-based learning architectures: mixtures of Gaussians and locally weighted regression. While the techniques for neural networks are computationally expensive and approximate, the techniques for mixtures of Gaussians and locally weighted regression are both efficient and accurate. Empirically, we observe that the optimality criterion sharply decreases the number of training examples the learner needs in order to achieve good performance.
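
The abstract does not spell out the selection criterion, so the following is only an illustrative sketch of the general idea of statistically motivated data selection, not the paper's method. It assumes plain linear least-squares regression as a stand-in for the architectures named above, and picks the candidate query that minimizes the average predictive variance over a set of reference inputs. The names `expected_avg_variance`, `select_query`, `refs`, and `ridge` are hypothetical and chosen for this example.

```python
import numpy as np


def expected_avg_variance(A_inv, x, refs):
    """Average predictive variance over `refs` after adding query `x`.

    Assumes a linear least-squares model; the noise variance is a constant
    factor and is left out. A_inv is the inverse of X^T X for the current data.
    """
    # Sherman-Morrison rank-one update:
    # (A + x x^T)^-1 = A^-1 - (A^-1 x)(A^-1 x)^T / (1 + x^T A^-1 x)
    Ax = A_inv @ x
    new_inv = A_inv - np.outer(Ax, Ax) / (1.0 + x @ Ax)
    # r^T new_inv r for every reference row r, then averaged
    return float(np.mean(np.einsum("ij,jk,ik->i", refs, new_inv, refs)))


def select_query(X, candidates, refs, ridge=1e-6):
    """Return (index of best candidate, list of scores for all candidates)."""
    # A small ridge term (an assumption of this sketch) keeps the inverse stable.
    A_inv = np.linalg.inv(X.T @ X + ridge * np.eye(X.shape[1]))
    scores = [expected_avg_variance(A_inv, c, refs) for c in candidates]
    return int(np.argmin(scores)), scores


if __name__ == "__main__":
    # Features are [1, t]; both existing training inputs sit near t = 0.
    X = np.array([[1.0, 0.0], [1.0, 0.1]])
    candidates = np.array([[1.0, 0.05], [1.0, 1.0]])
    refs = np.column_stack([np.ones(11), np.linspace(0.0, 1.0, 11)])
    best, scores = select_query(X, candidates, refs)
    print("chosen candidate:", candidates[best], "scores:", scores)
```

For a linear model the predictive variance does not depend on the observed label, so the score can be computed before the query is made. The nonlinear models discussed in the abstract would need their own variance estimates.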

Citations

4821 Maximum-likelihood from incomplete data via the EM algorithm – Dempster, Laird, et al. - 1977
4776 Probabilistic reasoning in intelligent systems: networks of plausible inference – Pearl - 1988
623 Learning Bayesian networks: The combination of knowledge and statistical data – Heckerman, Geiger, et al. - 1994
500 Queries and concept learning – Angluin - 1988
422 Statistical analysis of finite mixture distributions – Titterington, Smith, et al. - 1985
237 Empirical Model Building and Response Surfaces – Box, Draper - 1987
232 Improving generalization with active learning – Cohn, Atlas, et al. - 1994
193 Theory of Optimal Experiments – Fedorov - 1972
178 Information-based objective functions for active data selection – MacKay
134 Supervised learning from incomplete data via an EM approach – Ghahramani, Jordan - 1994
112 Applied Linear Regression – Weisberg - 1980
105 A general regression neural network – Specht - 1991
94 Neural network exploration using optimal experiment design – Cohn - 1994
72 Soft competitive adaptation: Neural network learning algorithms based on fitting statistical mixtures – Nowlan - 1991
69 Robot juggling: An implementation of memory-based learning. Control Systems Magazine – Schaal, Atkeson - 1994
51 Active exploration in dynamic environments – Thrun, Moeller - 1992
47 Training connectionist networks with queries and selective sampling – Cohn, Atlas, et al. - 1990
42 Selecting concise training sets from clean data – Franco, Plutowski, et al. - 1993
34 Bayesian Classification – Cheeseman, Self, et al. - 1988
27 Optimal Control Systems – Fel’dbaum - 1965
17 Reinforcement driven information acquisition in non-deterministic environments – Storck, Hochreiter, et al. - 1995
16 Bayesian query construction for neural network models – Paas, Kindermann - 1995
10 Regression By Local Fitting – Cleveland, Devlin, et al. - 1988
4 Bayesian classification – Cheeseman, Self, et al. - 1988
4 Implementing inner drive by competence reflection – Linden, Weber - 1993
3 Neural network algorithms that learn in polynomial time from examples and queries – Baum - 1991
3 Regression by local fitting – Cleveland, Devlin, et al. - 1988
2 Implementing Inner Drive by Competence Reflection – Linden, Weber - 1993
1 Minimizing statistical bias with queries. AI Lab memo AIM1552, Massachusetts Institute of Technology. Available by anonymous ftp from publications.ai.mit.edu – Cohn - 1995
1 Active Learning with Statistical Models Geman – Bienenstock, E - 1992