Active Learning with Statistical Models
1995
Abstract

Cited by 657 (12 self)
For many types of learners one can compute the statistically "optimal" way to select data. We review how these techniques have been used with feedforward neural networks [MacKay, 1992; Cohn, 1994]. We then show how the same principles may be used to select data for two alternative, statistically-based learning architectures: mixtures of Gaussians and locally weighted regression. While the techniques for neural networks are expensive and approximate, the techniques for mixtures of Gaussians and locally weighted regression are both efficient and accurate.
Less is more: Active learning with support vector machines
2000
Abstract

Cited by 265 (1 self)
We describe a simple active learning heuristic which greatly enhances the generalization behavior of support vector machines (SVMs) on several practical document classification tasks. We observe a number of benefits, the most surprising of which is that a SVM trained on a well-chosen subset of the available corpus frequently performs better than one trained on all available data. The heuristic for choosing this subset is simple to compute, and makes no use of information about the test set. Given that the training time of SVMs depends heavily on the training set size, our heuristic not only offers better performance with fewer data, it frequently does so in less time than the naive approach of training on all available data.
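The abstract above does not spell out the heuristic. A minimal sketch of one common margin-based choice, offered here purely as an assumption: query the unlabeled example closest to the current separating hyperplane. The function names, the linear decision function w·x + b, and the toy pool are all hypothetical and are not taken from the paper.

```python
from typing import Sequence


def decision_value(w: Sequence[float], b: float, x: Sequence[float]) -> float:
    """Value of a linear decision function f(x) = w.x + b (assumed form)."""
    return sum(wi * xi for wi, xi in zip(w, x)) + b


def pick_query(w: Sequence[float], b: float,
               pool: Sequence[Sequence[float]]) -> int:
    """Return the index of the pool example with the smallest |f(x)|.

    The distance to the hyperplane is |f(x)| / ||w||; since ||w|| is the
    same for every candidate, ranking by |f(x)| gives the same choice.
    """
    return min(range(len(pool)),
               key=lambda i: abs(decision_value(w, b, pool[i])))


# Toy usage: hypothetical weights and an unlabeled pool of three points.
w, b = [1.0, -1.0], 0.0
pool = [[3.0, 0.0], [1.0, 0.9], [-2.0, 2.0]]
chosen = pick_query(w, b, pool)  # index of the point nearest the boundary
```

In a full loop one would label the chosen example, add it to the training set, retrain, and repeat until a labeling budget is spent.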
Query Learning for Maximum Information Gain in a Multi-Layer Neural Network
 Mathematics of Neural Networks: Models, Algorithms and Applications
1997
Abstract

Cited by 1 (0 self)
... this paper queries which maximize the expected information gain, which are related to the criterion of (Bayes) D-optimality in optimal experimental design. The generalization performance achieved by maximum information gain queries is by now well understood for single-layer neural networks such as linear and binary perceptrons [1, 2, 3]. For multilayer networks, which are much more widely used in ...
This work was partially supported by European Union grant no. ERB CHRXCT920063
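As an illustration of the single-layer case mentioned above, the following sketch scores candidate queries for a linear model by expected information gain. It assumes a Gaussian posterior over the weights with covariance cov and Gaussian output noise with variance noise_var, in which case the gain from querying x is 0.5 * ln(1 + x^T cov x / noise_var). All names are hypothetical, and this is not the multi-layer method of the paper.

```python
import math
from typing import Sequence

Matrix = Sequence[Sequence[float]]


def info_gain(cov: Matrix, x: Sequence[float], noise_var: float = 1.0) -> float:
    """Expected entropy reduction (in nats) from observing the output at x.

    Assumes a linear-Gaussian model: weights ~ N(mean, cov), output noise
    variance noise_var. The posterior mean does not affect the gain.
    """
    n = len(x)
    cov_x = [sum(cov[i][j] * x[j] for j in range(n)) for i in range(n)]
    quad = sum(x[i] * cov_x[i] for i in range(n))  # x^T cov x
    return 0.5 * math.log(1.0 + quad / noise_var)


def best_query(cov: Matrix, candidates: Sequence[Sequence[float]],
               noise_var: float = 1.0) -> int:
    """Index of the candidate input with the largest expected gain."""
    return max(range(len(candidates)),
               key=lambda i: info_gain(cov, candidates[i], noise_var))


# Toy usage: the weight posterior is most uncertain along the first axis,
# so a query along that axis is the more informative one.
cov = [[4.0, 0.0], [0.0, 1.0]]
candidates = [[1.0, 0.0], [0.0, 1.0]]
chosen = best_query(cov, candidates)
```

Picking the maximizer at each step greedily increases the determinant of the accumulated information matrix, which is the connection to D-optimality under these assumptions.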