Abstract:
Active learning differs from passive "learning from examples" in that the learning algorithm assumes at least some control over what part of the input domain it receives information about. In some situations, active learning is provably more powerful than learning from examples alone, giving better generalization for a fixed number of training examples. In this paper, we consider the problem of learning a binary concept in the absence of noise (Valiant 1984). We describe a formalism for active concept learning called selective sampling, and show how it may be approximately implemented by a neural network. In selective sampling, a learner receives distribution information from the environment and queries an oracle on parts of the domain it considers "useful." We test our implementation, called an SG-network, on three domains, and observe significant improvement in generalization.
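To make the idea of selective sampling concrete, the following is a minimal sketch on a toy problem, not the SG-network described in the paper. It assumes a noise-free threshold concept on [0, 1] (a point is positive when it is at or above an unknown threshold), a uniform input distribution, and a learner that tracks the interval of thresholds still consistent with the labels seen so far. The function name `selective_sampling` and the parameters `oracle`, `draw`, and `n_draws` are hypothetical and chosen only for illustration.

```python
import random


def selective_sampling(oracle, draw, n_draws):
    """Toy selective sampling for a threshold concept on [0, 1].

    Assumption: the target labels x positive iff x >= t for an unknown t.
    Every threshold still consistent with the labels seen lies in (lo, hi].
    """
    lo, hi = 0.0, 1.0
    queries = 0
    for _ in range(n_draws):
        x = draw()  # distribution information: an unlabeled sample
        # Query only when consistent thresholds disagree about x.
        if lo < x < hi:
            queries += 1
            if oracle(x):
                hi = x  # x is positive, so t <= x
            else:
                lo = x  # x is negative, so t > x
    return lo, hi, queries


rng = random.Random(0)
target = 0.37  # hypothetical threshold, hidden from the learner


def oracle(x):
    return x >= target


lo, hi, queries = selective_sampling(oracle, rng.random, 1000)
print(f"threshold lies in ({lo:.4f}, {hi:.4f}] after {queries} queries")
```

In this sketch the learner asks for a label only when a drawn point falls where the remaining hypotheses disagree, so most draws cost no query once the interval has narrowed. A passive learner given the same 1000 draws would need a label for every one of them.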