Training A 3-Node Neural Network Is NP-Complete
, 1992
Cited by 186 (2 self)
We consider a 2-layer, 3-node, n-input neural network whose nodes compute linear threshold functions of their inputs. We show that it is NP-complete to decide whether there exist weights and thresholds for this network so that it produces output consistent with a given set of training examples. We extend the result to other simple networks. We also present a network for which training is hard but where switching to a more powerful representation makes training easier. These results suggest that those looking for perfect training algorithms cannot escape inherent computational difficulties just by considering only simple or very regular networks. They also suggest the importance, given a training problem, of finding an appropriate network and input encoding for that problem. It is left as an open problem to extend our result to nodes with non-linear functions such as sigmoids. Keywords: Neural networks, computational complexity, NP-completeness, intractability, learning, training, mu...
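The decision problem in the abstract above can be made concrete with a small sketch. The code below is an illustration only, not taken from the paper: it assumes the 3-node network consists of two hidden threshold nodes feeding one output threshold node, and that a node fires when its weighted sum is at least its threshold. The names `threshold_node`, `network_output`, and `is_consistent` are hypothetical. It shows only the easy direction, checking that given weights are consistent with the training examples; finding such weights is the part the abstract states is NP-complete.

```python
from typing import List, Sequence, Tuple

# A node is (weights, threshold). This representation is an assumption
# made for illustration, not notation from the paper.
Node = Tuple[Sequence[float], float]
Example = Tuple[Sequence[int], int]  # (input vector, target label in {0, 1})


def threshold_node(weights: Sequence[float], threshold: float,
                   inputs: Sequence[float]) -> int:
    """Linear threshold function: 1 if w . x >= threshold, else 0 (assumed convention)."""
    return 1 if sum(w * x for w, x in zip(weights, inputs)) >= threshold else 0


def network_output(hidden1: Node, hidden2: Node, output: Node,
                   inputs: Sequence[float]) -> int:
    """Two hidden threshold nodes on the inputs, one output node on their outputs."""
    h = (threshold_node(*hidden1, inputs), threshold_node(*hidden2, inputs))
    return threshold_node(*output, h)


def is_consistent(hidden1: Node, hidden2: Node, output: Node,
                  examples: List[Example]) -> bool:
    """Check, in time linear in the data size, that the network fits every example."""
    return all(network_output(hidden1, hidden2, output, x) == y
               for x, y in examples)


# XOR on two inputs: not computable by one threshold node,
# but computable by this 3-node network.
xor_examples = [((0, 0), 0), ((0, 1), 1), ((1, 0), 1), ((1, 1), 0)]
h1 = ((1.0, 1.0), 0.5)    # fires when at least one input is 1 (OR)
h2 = ((1.0, 1.0), 1.5)    # fires when both inputs are 1 (AND)
out = ((1.0, -1.0), 0.5)  # fires when h1 is on and h2 is off

print(is_consistent(h1, h2, out, xor_examples))
```

Because a proposed set of weights can be verified this cheaply, the difficulty lies entirely in the search for weights, which is consistent with the problem being in NP.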
An experimental and theoretical comparison of model selection methods
- Machine Learning 27
, 1997
Cited by 101 (5 self)
In the model selection problem, we must balance the complexity of a statistical model with its goodness of fit to the training data. This problem arises repeatedly in statistical estimation, machine learning, and scientific inquiry in general.
PAC Learning Intersections of Halfspaces with Membership Queries
- ALGORITHMICA
, 1998
Cited by 21 (1 self)
A randomized learning algorithm Polly is presented that efficiently learns intersections of s halfspaces in n dimensions, in time polynomial in both s and n. The learning protocol is the "PAC" (probably approximately correct) model of Valiant, augmented with membership queries. In particular, Polly receives a set S of m = poly(n, s, 1/ε, 1/δ) randomly generated points from an arbitrary distribution over the unit hypercube, and is told exactly which points are contained in, and which points are not contained in, the convex polyhedron P defined by the halfspaces. Polly may also obtain the same information about points of its own choosing. It is shown that after poly(n, s, 1/ε, 1/δ, log(1/d)) time, the probability that Polly fails to output a collection of s halfspaces with classification error at most ε is at most δ. Here, d is the minimum distance between the boundary of the target and those examples in S that are not lying on the boundary. The parameter log(1/d) can be ...
Complexity Theoretic Hardness Results for Query Learning
- Computational Complexity
, 1998
Cited by 17 (5 self)
We investigate the complexity of learning for the well-studied model in which the learning algorithm may ask membership and equivalence queries. While complexity theoretic techniques have previously been used to prove hardness results in various learning models, these techniques typically are not strong enough to use when a learning algorithm may make membership queries. We develop a general technique for proving hardness results for learning with membership and equivalence queries (and for more general query models). We apply the technique to show that, assuming NP ≠ co-NP, no polynomial-time membership and (proper) equivalence query algorithms exist for exactly learning read-thrice DNF formulas, unions of k ≥ 3 halfspaces over the Boolean domain, or some other related classes. Our hardness results are representation dependent, and do not preclude the existence of representation independent algorithms.
Overfitting and Undercomputing in Machine Learning
- Computing Surveys
, 1995
Cited by 17 (0 self)
suggests a reasonable line of research: find algorithms that can search the hypothesis class better. Hence, there has been extensive research in applying second-order methods to fit neural networks and in conducting much more thorough searches in learning decision trees and rule sets. Ironically, when these algorithms were tested on real datasets, it was found that their performance was often worse than simple gradient descent or greedy search [3, 5]. In short: it appears to be better not to optimize! One of the other important trends in machine learning research has been the establishment and nurturing of connections between various previously disparate fields including computational learning theory, connectionist learning, symbolic learning, and statistics. The connection to statistics was crucial in resolving this paradox. The key problem arises from the structure of the machine learning task. A learning algorithm is trained on a set of training data, but then it is applied to make
Probabilistic Analysis of Learning in Artificial Neural Networks: The PAC Model and its Variants
, 1994
Cited by 16 (4 self)
There are a number of mathematical approaches to the study of learning and generalization in artificial neural networks. Here we survey the 'probably approximately correct' (PAC) model of learning and some of its variants. These models, much-studied since the introduction of the basic PAC model by Valiant in 1984, provide a probabilistic framework for the discussion of generalization and learning. Contents: 1 Introduction; 2 The Basic PAC Model of Learning; 3 VC-Dimension and Growth Function; 4 VC-Dimension and Linear Dimension; 5 A Useful Probability Theorem; 6 PAC Learning and the VC-Dimension; 7 VC-Dimension of Binary-Output Networks (7.1 Introduction; 7.2 Linearly weighted neural networks; 7.3 Linear threshold networks; 7.4 Other activation functions; 7.5 The effect of weight restrictions); 8 Computational Complexity of Learning; 9 Stochastic Concepts; 10 Distribution-Specific Learning; 11 Graph Dimension and Multiple-Output Nets; 11.1 T...
On the Complexity of Optimization Problems for 3-Dimensional Convex Polyhedra and Decision Trees
- Comput. Geom. Theory Appl
, 1995
Cited by 16 (0 self)
We show that several well-known optimization problems involving 3-dimensional convex polyhedra and decision trees are NP-hard or NP-complete. One of the techniques we employ is a linear-time method for realizing a planar 3-connected triangulation as a convex polyhedron, which may be of independent interest. Key words: Convex polyhedra, approximation, Steinitz's theorem, planar graphs, art gallery theorems, decision trees. 1 Introduction. Convex polyhedra are fundamental geometric structures (e.g., see [20]). They are the product of convex hull algorithms, and are key components for problems in robot motion planning and computer-aided geometric design. Moreover, due to a beautiful theorem of Steinitz [20, 38], they provide a strong link between computational geometry and graph theory, for Steinitz shows that a graph forms the edge structure of a convex polyhedron if and only if it is planar and 3-connected. Unfortunately, algorithmic problems dealing with 3-dimensional convex polyhedra ...
Noise-Tolerant Distribution-Free Learning of General Geometric Concepts
, 1996
Cited by 15 (3 self)
this paper. First, we give an algorithm to learn C
Induction over the unexplained: Using overly-general domain theories to aid concept learning
, 1993
Cited by 14 (0 self)
This paper describes and evaluates an approach to combining empirical and explanation-based learning called Induction Over the Unexplained (IOU). IOU is intended for learning concepts that can be partially explained by an overly-general domain theory. An eclectic evaluation of the method is presented which includes results from all three major approaches: empirical, theoretical, and psychological. Empirical results show that IOU is effective at refining overly-general domain theories and that it learns more accurate concepts from fewer examples than a purely empirical approach. The application of theoretical results from PAC learnability theory explains why IOU requires fewer examples. IOU is also shown to be able to model psychological data demonstrating the effect of background knowledge on human learning.

