Results 1 - 9 of 9
Sample compression, learnability, and the Vapnik-Chervonenkis dimension
- MACHINE LEARNING
, 1995
Cited by 55 (3 self)
Within the framework of pac-learning, we explore the learnability of concepts from samples using the paradigm of sample compression schemes. A sample compression scheme of size k for a concept class C ⊆ 2^X consists of a compression function and a reconstruction function. The compression function receives a finite sample set consistent with some concept in C and chooses a subset of k examples as the compression set. The reconstruction function forms a hypothesis on X from a compression set of k examples. For any sample set of a concept in C the compression set produced by the compression function must lead to a hypothesis consistent with the whole original sample set when it is fed to the reconstruction function. We demonstrate that the existence of a sample compression scheme of fixed size for a class C is sufficient to ensure that the class C is pac-learnable. Previous work has shown that a class is pac-learnable if and only if the Vapnik-Chervonenkis (VC) dimension of the class i...
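To make the compression/reconstruction pair in this abstract concrete, here is a minimal sketch for one simple concept class: closed intervals on the real line. The class, the function names, and the choice to keep at most two examples (the extreme positive points) are illustrative assumptions, not taken from the paper.

```python
from typing import Callable, List, Tuple

# Assumption: an example is a (point, label) pair and the concept class is
# the set of closed intervals [a, b] on the real line.
Example = Tuple[float, bool]


def compress(sample: List[Example]) -> List[Example]:
    """Keep at most two examples: the leftmost and rightmost positive points."""
    positives = [x for x, label in sample if label]
    if not positives:
        return []  # no positive point seen, so nothing needs to be stored
    lo, hi = min(positives), max(positives)
    return [(lo, True)] if lo == hi else [(lo, True), (hi, True)]


def reconstruct(compression_set: List[Example]) -> Callable[[float], bool]:
    """Build a hypothesis on the whole domain from the compression set alone."""
    points = [x for x, _ in compression_set]
    if not points:
        return lambda x: False  # empty compression set: always predict negative
    lo, hi = min(points), max(points)
    return lambda x: lo <= x <= hi


def is_consistent(h: Callable[[float], bool], sample: List[Example]) -> bool:
    """Check that the hypothesis labels every sample point correctly."""
    return all(h(x) == label for x, label in sample)


# A sample consistent with some interval (for instance [1.0, 3.0]).
sample = [(0.5, False), (1.2, True), (2.0, True), (2.7, True), (3.4, False)]
kept = compress(sample)
hypothesis = reconstruct(kept)
print(kept, is_consistent(hypothesis, sample))
```

The reconstructed interval spans only the positive points, so it sits inside the target interval; every negative point of a consistent sample therefore stays outside it, which is the consistency requirement the abstract describes.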
Learning in the Presence of Finitely or Infinitely Many Irrelevant Attributes
, 1995
Cited by 46 (8 self)
This paper addresses the problem of learning boolean functions in query and mistake-bound
On PAC Learning using Winnow, Perceptron, and a Perceptron-Like Algorithm
Cited by 18 (8 self)
In this paper we analyze the PAC learning abilities of several simple iterative algorithms for learning linear threshold functions, obtaining both positive and negative results. We show that Littlestone’s Winnow algorithm is not an efficient PAC learning algorithm for the class of positive linear threshold functions. We also prove that the Perceptron algorithm cannot efficiently learn the unrestricted class of linear threshold functions even under the uniform distribution on boolean examples. However, we show that the Perceptron algorithm can efficiently PAC learn the class of nested functions (a concept class known to be hard for Perceptron under arbitrary distributions) under the uniform distribution on boolean examples. Finally, we give a very simple Perceptron-like algorithm for learning origin-centered halfspaces under the uniform distribution on the unit sphere in R^n. Unlike the Perceptron algorithm, which cannot learn in the presence of classification noise, the new algorithm can learn in the presence of monotonic noise (a generalization of classification noise). The new algorithm is significantly faster than previous algorithms in both the classification and monotonic noise settings.
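For readers unfamiliar with the two online algorithms named in this abstract, the sketch below contrasts their update rules in common textbook form: Perceptron updates additively, Winnow multiplicatively. It does not reproduce the paper's PAC analysis or its new Perceptron-like algorithm. The threshold `theta = n`, the factor `alpha = 2`, the pass limit, and the toy target (a disjunction of two of four boolean attributes) are all assumptions chosen for illustration.

```python
from itertools import product
from typing import List, Sequence, Tuple

Data = List[Tuple[Sequence[int], int]]  # (boolean attribute vector, label in {-1, +1})


def predict_perceptron(w: List[float], b: float, x: Sequence[int]) -> int:
    return 1 if sum(wi * xi for wi, xi in zip(w, x)) + b > 0 else -1


def train_perceptron(data: Data, n: int, max_passes: int = 100) -> Tuple[List[float], float]:
    """Additive update: on a mistake, add y * x to the weights (and y to the bias)."""
    w, b = [0.0] * n, 0.0
    for _ in range(max_passes):
        mistakes = 0
        for x, y in data:
            if predict_perceptron(w, b, x) != y:
                w = [wi + y * xi for wi, xi in zip(w, x)]
                b += y
                mistakes += 1
        if mistakes == 0:  # a full pass without mistakes: consistent with data
            break
    return w, b


def predict_winnow(w: List[float], theta: float, x: Sequence[int]) -> int:
    return 1 if sum(wi * xi for wi, xi in zip(w, x)) >= theta else -1


def train_winnow(data: Data, n: int, alpha: float = 2.0,
                 max_passes: int = 100) -> Tuple[List[float], float]:
    """Multiplicative update: on a mistake, scale the weights of active attributes."""
    w, theta = [1.0] * n, float(n)  # assumed threshold: theta = n
    for _ in range(max_passes):
        mistakes = 0
        for x, y in data:
            if predict_winnow(w, theta, x) != y:
                factor = alpha if y == 1 else 1.0 / alpha  # promote or demote
                w = [wi * factor if xi == 1 else wi for wi, xi in zip(w, x)]
                mistakes += 1
        if mistakes == 0:
            break
    return w, theta


# Toy target (an assumption): x[0] OR x[2] over four boolean attributes.
n = 4
data: Data = [(x, 1 if (x[0] or x[2]) else -1) for x in product([0, 1], repeat=n)]

w_p, b_p = train_perceptron(data, n)
w_w, theta = train_winnow(data, n)
print("perceptron:", w_p, b_p)
print("winnow:", w_w, theta)
```

Both learners end up consistent with this small separable data set; the difference the literature focuses on is how the number of mistakes scales with the number of irrelevant attributes.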
On the Impact of Forgetting on Learning Machines
- Journal of the ACM
, 1993
Cited by 9 (3 self)
This paper contributes toward the goal of understanding how a computer can be programmed to learn by isolating features of incremental learning algorithms that theoretically enhance their learning potential. In particular, we examine the effects of imposing a limit on the amount of information that a learning algorithm can hold in its memory as it attempts to ... (This work was facilitated by an international agreement under NSF Grant 9119540.)
Toward Attribute Efficient Learning of Decision Lists and Parities
- In Proceedings of COLT
, 2006
Cited by 7 (0 self)
We consider two well-studied problems regarding attribute efficient learning: learning decision lists and learning parity functions. First, we give an algorithm for learning decision lists of length k over n variables using 2 . This is the first algorithm for learning decision lists that has both subexponential sample complexity and subexponential running time in the relevant parameters. Our approach establishes a relationship between attribute efficient learning and polynomial threshold functions and is based on a new construction of low degree, low weight polynomial threshold functions for decision lists. For a wide range of parameters our construction matches a lower bound due to Beigel for decision lists and gives an essentially optimal tradeoff between polynomial threshold function degree and weight.
PAGODA: A Model for Autonomous Learning in Probabilistic Domains
, 1992
Cited by 5 (2 self)
as a testbed for designing intelligent agents. The system consists of an overall agent architecture and five components within the architecture. The five components are: 1. Goal-Directed Learning (GDL), a decision-theoretic method for selecting learning goals. 2. Probabilistic Bias Evaluation (PBE), a technique for using probabilistic background knowledge to select learning biases for the learning goals. 3. Uniquely Predictive Theories (UPTs) and Probability Computation using Independence (PCI), a probabilistic representation and Bayesian inference method for the agent's theories. 4. A probabilistic learning component, consisting of a heuristic search algorithm and a Bayesian method for evaluating proposed theories. 5. A decision-theoretic probabilistic planner, which searches through the probability space defined by the agent's current theory to select the best action. PAGODA is given as input an initial planning goal (its ove
Trial and Error: A New Approach to Space-Bounded Learning
, 1993
Cited by 1 (0 self)
A pac-learning algorithm is d-space bounded if it stores at most d examples from the sample at any time. We characterize the d-space learnable concept classes. For this purpose we introduce the compression parameter of a concept class C and design our Trial and Error Learning Algorithm. We show: C is d-space learnable if and only if the compression parameter of C is at most d. This learning algorithm does not produce a hypothesis consistent with the whole sample, unlike previous approaches, e.g. by Floyd, who presents consistent space bounded learning algorithms but has to restrict herself to very special concept classes. On the other hand, our algorithm needs large samples; the compression parameter appears as an exponent in the sample size. We present several examples of polynomial time space bounded learnable concept classes: all intersection closed concept classes with finite VC-dimension; convex n-gons in R^2; halfspaces in R^n; unions of triangles...
Toward Attribute Efficient Learning Algorithms
We make the first progress on two important problems regarding attribute efficient learnability. First, we give an algorithm for learning decision lists of length k over n variables using
PAC Analogues of Perceptron and Winnow via Boosting the Margin
- in Proc. 13th Conf. on Comp. Learning Theory
, 2000
We describe a novel family of PAC model algorithms for learning linear threshold functions. The new algorithms work by boosting a simple weak learner and exhibit complexity bounds remarkably similar to those of known online algorithms such as Perceptron and Winnow, thus suggesting that these well-studied online algorithms in some sense correspond to instances of boosting. We show that the new algorithms can be viewed as natural PAC analogues of the online p-norm algorithms which have recently been studied by Grove, Littlestone, and Schuurmans [16] and Gentile and Littlestone [15]. As special cases of the algorithm, by taking p = 2 and p = ∞ we obtain natural boosting-based PAC analogues of Perceptron and Winnow respectively. The p = ∞ case of our algorithm can also be viewed as a generalization (with an improved sample complexity bound) of Jackson and Craven's PAC-model boosting-based algorithm for learning "sparse perceptrons" [20]. The analysis of the generalizatio...

