Abstract:
We consider a 2-layer, 3-node, n-input neural network whose nodes compute linear threshold functions of their inputs. We show that it is NP-complete to decide whether there exist weights and thresholds for this network so that it produces output consistent with a given set of training examples. We extend the result to other simple networks. We also present a network for which training is hard but where switching to a more powerful representation makes training easier. These results suggest that those looking for perfect training algorithms cannot escape inherent computational difficulties just by considering only simple or very regular networks. They also suggest the importance, given a training problem, of finding an appropriate network and input encoding for that problem. It is left as an open problem to extend our result to nodes with non-linear functions such as sigmoids.
Keywords: Neural networks, computational complexity, NP-completeness, intractability, learning, training, mu...
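To make the decision problem in the abstract concrete, the following is a minimal sketch of a 3-node linear threshold network and a consistency check against training examples. It assumes two hidden nodes that each read all n inputs and one output node that reads the two hidden outputs, with nodes emitting 1 when the weighted sum strictly exceeds the threshold and 0 otherwise. The names `fire`, `network_output`, `is_consistent`, `xor_network`, and `xor_examples`, as well as the 0/1 output convention, are illustrative assumptions, not taken from the paper.

```python
from typing import Sequence, Tuple

# One linear threshold node: (weights, threshold). Hypothetical representation.
Node = Tuple[Sequence[float], float]
# Assumed layout: (hidden node 1, hidden node 2, output node).
Network = Tuple[Node, Node, Node]
# One training example: (input vector, expected 0/1 label).
Example = Tuple[Sequence[float], int]


def fire(node: Node, inputs: Sequence[float]) -> int:
    """Return 1 if the weighted sum strictly exceeds the threshold, else 0."""
    weights, threshold = node
    total = sum(w * x for w, x in zip(weights, inputs))
    return 1 if total > threshold else 0


def network_output(network: Network, inputs: Sequence[float]) -> int:
    """Evaluate the 3-node network: two hidden nodes feed one output node."""
    hidden1, hidden2, output = network
    hidden_values = (fire(hidden1, inputs), fire(hidden2, inputs))
    return fire(output, hidden_values)


def is_consistent(network: Network, examples: Sequence[Example]) -> bool:
    """Check whether the network reproduces every training label."""
    return all(network_output(network, x) == label for x, label in examples)


# Illustrative instance: XOR on two inputs.
xor_examples = [((0, 0), 0), ((0, 1), 1), ((1, 0), 1), ((1, 1), 0)]
xor_network: Network = (
    ((1.0, 1.0), 0.5),   # hidden node 1 acts as OR
    ((1.0, 1.0), 1.5),   # hidden node 2 acts as AND
    ((1.0, -1.0), 0.5),  # output fires when OR is on and AND is off
)

print(is_consistent(xor_network, xor_examples))
```

Checking one candidate assignment of weights and thresholds, as `is_consistent` does, takes a single pass over the examples. The hardness result stated in the abstract concerns the different question of deciding whether any consistent assignment exists at all.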