Sample compression, learnability, and the Vapnik-Chervonenkis dimension
 MACHINE LEARNING
, 1995
Cited by 61 (3 self)
Within the framework of PAC-learning, we explore the learnability of concepts from samples using the paradigm of sample compression schemes. A sample compression scheme of size k for a concept class C ⊆ 2^X consists of a compression function and a reconstruction function. The compression function receives a finite sample set consistent with some concept in C and chooses a subset of k examples as the compression set. The reconstruction function forms a hypothesis on X from a compression set of k examples. For any sample set of a concept in C, the compression set produced by the compression function must lead to a hypothesis consistent with the whole original sample set when it is fed to the reconstruction function. We demonstrate that the existence of a sample compression scheme of fixed size for a class C is sufficient to ensure that the class C is PAC-learnable. Previous work has shown that a class is PAC-learnable if and only if the Vapnik-Chervonenkis (VC) dimension of the class i...
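The compression and reconstruction functions described in this abstract can be illustrated with a minimal sketch. The concept class used here (closed intervals on the real line, with a scheme that keeps at most two examples) is a textbook illustration chosen for this listing, not an example taken from the paper, and the names `compress`, `reconstruct`, and `is_consistent` are hypothetical.

```python
from typing import Callable, List, Tuple

Example = Tuple[float, bool]  # (point on the real line, label)


def compress(sample: List[Example]) -> List[Example]:
    """Keep at most two examples: the leftmost and rightmost positive points.

    Assumes the sample is consistent with some closed interval [a, b].
    """
    positives = [x for x, label in sample if label]
    if not positives:
        return []  # no positive points: the empty set encodes the empty hypothesis
    lo, hi = min(positives), max(positives)
    return [(lo, True)] if lo == hi else [(lo, True), (hi, True)]


def reconstruct(compression_set: List[Example]) -> Callable[[float], bool]:
    """Build a hypothesis on the whole domain from the compression set alone."""
    points = [x for x, _ in compression_set]
    if not points:
        return lambda x: False
    lo, hi = min(points), max(points)
    return lambda x: lo <= x <= hi


def is_consistent(hypothesis: Callable[[float], bool], sample: List[Example]) -> bool:
    """Check that the hypothesis labels every example in the sample correctly."""
    return all(hypothesis(x) == label for x, label in sample)


# A sample labelled by the (assumed) target interval [1, 3].
sample = [(0.5, False), (1.0, True), (2.5, True), (1.7, True), (3.2, False)]
kept = compress(sample)
hypothesis = reconstruct(kept)
```

The defining requirement of the scheme is that `is_consistent(reconstruct(compress(sample)), sample)` holds for every sample labelled by a concept in the class; here the compression set never exceeds two examples, regardless of the sample size.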
Learning in the Presence of Finitely or Infinitely Many Irrelevant Attributes
, 1995
Cited by 52 (8 self)
This paper addresses the problem of learning boolean functions in query and mistake-bound
On PAC Learning using Winnow, Perceptron, and a Perceptron-Like Algorithm
Cited by 20 (9 self)
In this paper we analyze the PAC learning abilities of several simple iterative algorithms for learning linear threshold functions, obtaining both positive and negative results. We show that Littlestone’s Winnow algorithm is not an efficient PAC learning algorithm for the class of positive linear threshold functions. We also prove that the Perceptron algorithm cannot efficiently learn the unrestricted class of linear threshold functions even under the uniform distribution on boolean examples. However, we show that the Perceptron algorithm can efficiently PAC learn the class of nested functions (a concept class known to be hard for Perceptron under arbitrary distributions) under the uniform distribution on boolean examples. Finally, we give a very simple Perceptron-like algorithm for learning origin-centered halfspaces under the uniform distribution on the unit sphere in R^n. Unlike the Perceptron algorithm, which cannot learn in the presence of classification noise, the new algorithm can learn in the presence of monotonic noise (a generalization of classification noise). The new algorithm is significantly faster than previous algorithms in both the classification and monotonic noise settings.
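For readers unfamiliar with the baseline being analyzed, the following is a minimal sketch of the classic Perceptron update for origin-centered halfspaces. It is the standard textbook rule, not the new Perceptron-like algorithm the abstract announces (whose details are not given here). The function names, the `max_epochs` cap, and the toy data are assumptions made for illustration; the example list is assumed to be non-empty.

```python
from typing import List, Sequence, Tuple

LabeledPoint = Tuple[Sequence[float], int]  # (feature vector, label in {-1, +1})


def dot(w: Sequence[float], x: Sequence[float]) -> float:
    """Inner product of two vectors of equal length."""
    return sum(wi * xi for wi, xi in zip(w, x))


def perceptron(examples: List[LabeledPoint], max_epochs: int = 100) -> List[float]:
    """Classic Perceptron: on each mistake, add y * x to the weight vector.

    The hypothesis is the origin-centered halfspace sign(w . x), so there is
    no bias term. Stops early once a full pass makes no mistakes.
    """
    w = [0.0] * len(examples[0][0])
    for _ in range(max_epochs):
        mistakes = 0
        for x, y in examples:
            if y * dot(w, x) <= 0:  # misclassified, or exactly on the boundary
                w = [wi + y * xi for wi, xi in zip(w, x)]
                mistakes += 1
        if mistakes == 0:
            break
    return w


def predict(w: Sequence[float], x: Sequence[float]) -> int:
    """Label a point with the halfspace defined by w."""
    return 1 if dot(w, x) > 0 else -1


# Toy data labelled by the (assumed) target halfspace x1 - x2 > 0.
data = [((1, 0), 1), ((0, 1), -1), ((2, 1), 1),
        ((1, 3), -1), ((-1, -2), 1), ((-2, -1), -1)]
weights = perceptron(data)
```

On linearly separable data with a positive margin the loop terminates with a consistent halfspace; the negative results in the abstract concern how many examples and updates this can require, not whether the update rule is well defined.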
On the Impact of Forgetting on Learning Machines
 Journal of the ACM
, 1993
Cited by 10 (3 self)
This paper contributes toward the goal of understanding how a computer can be programmed to learn by isolating features of incremental learning algorithms that theoretically enhance their learning potential. In particular, we examine the effects of imposing a limit on the amount of information that a learning algorithm can hold in its memory as it attempts to ... (This work was facilitated by an international agreement under NSF Grant 9119540.)
Toward Attribute Efficient Learning of Decision Lists and Parities
 In Proceedings of COLT
, 2006
Cited by 9 (1 self)
We consider two well-studied problems regarding attribute efficient learning: learning decision lists and learning parity functions. First, we give an algorithm for learning decision lists of length k over n variables using 2^{Õ(k^{1/3})} log n examples and time n^{Õ(k^{1/3})}. This is the first algorithm for learning decision lists that has both subexponential sample complexity and subexponential running time in the relevant parameters. Our approach establishes a relationship between attribute efficient learning and polynomial threshold functions and is based on a new construction of low degree, low weight polynomial threshold functions for decision lists. For a wide range of parameters our construction matches a lower bound due to Beigel for decision lists and gives an essentially optimal tradeoff between polynomial threshold function degree and weight.
PAGODA: A Model for Autonomous Learning in Probabilistic Domains
, 1992
Cited by 6 (2 self)
as a testbed for designing intelligent agents. The system consists of an overall agent architecture and five components within the architecture. The five components are: 1. Goal-Directed Learning (GDL), a decision-theoretic method for selecting learning goals. 2. Probabilistic Bias Evaluation (PBE), a technique for using probabilistic background knowledge to select learning biases for the learning goals. 3. Uniquely Predictive Theories (UPTs) and Probability Computation using Independence (PCI), a probabilistic representation and Bayesian inference method for the agent's theories. 4. A probabilistic learning component, consisting of a heuristic search algorithm and a Bayesian method for evaluating proposed theories. 5. A decision-theoretic probabilistic planner, which searches through the probability space defined by the agent's current theory to select the best action. PAGODA is given as input an initial planning goal (its ove
Trial and Error: A New Approach to Space-Bounded Learning
, 1993
Cited by 1 (0 self)
A PAC-learning algorithm is d-space bounded if it stores at most d examples from the sample at any time. We characterize the d-space learnable concept classes. For this purpose we introduce the compression parameter of a concept class C and design our Trial and Error Learning Algorithm. We show that C is d-space learnable if and only if the compression parameter of C is at most d. Unlike previous approaches, e.g. that of Floyd, who presents consistent space-bounded learning algorithms but has to restrict herself to very special concept classes, this learning algorithm does not produce a hypothesis consistent with the whole sample. On the other hand, our algorithm needs large samples; the compression parameter appears as an exponent in the sample size. We present several examples of polynomial-time space-bounded learnable concept classes:
• all intersection-closed concept classes with finite VC-dimension.
• convex n-gons in R^2.
• halfspaces in R^n.
• unions of triangles...
Toward Attribute Efficient Learning Algorithms
We make the first progress on two important problems regarding attribute efficient learnability. First, we give an algorithm for learning decision lists of length k over n variables using
PAC Analogues of Perceptron and Winnow via Boosting the Margin
 in Proc. 13th Conf. on Comp. Learning Theory
, 2000
We describe a novel family of PAC model algorithms for learning linear threshold functions. The new algorithms work by boosting a simple weak learner and exhibit complexity bounds remarkably similar to those of known online algorithms such as Perceptron and Winnow, thus suggesting that these well-studied online algorithms in some sense correspond to instances of boosting. We show that the new algorithms can be viewed as natural PAC analogues of the online p-norm algorithms which have recently been studied by Grove, Littlestone, and Schuurmans [16] and Gentile and Littlestone [15]. As special cases of the algorithm, by taking p = 2 and p = ∞ we obtain natural boosting-based PAC analogues of Perceptron and Winnow respectively. The p = ∞ case of our algorithm can also be viewed as a generalization (with an improved sample complexity bound) of Jackson and Craven's PAC-model boosting-based algorithm for learning "sparse perceptrons" [20]. The analysis of the generalizatio...
Harvard University
We consider two well-studied problems regarding attribute efficient learning: learning decision lists and learning parity functions. First, we give an algorithm for learning decision lists of length k over n variables using 2^{Õ(k^{1/3})} log n examples and time n^{Õ(k^{1/3})}. This is the first algorithm for learning decision lists that has both subexponential sample complexity and subexponential running time in the relevant parameters. Our approach is based on a new construction of low degree, low weight polynomial threshold functions for decision lists. For a wide range of parameters our construction matches a lower bound due to Beigel for decision lists and gives an essentially optimal tradeoff between polynomial threshold function degree and weight. Second, we give an algorithm for learning an unknown parity function on k out of n variables using O(n^{1−1/k}) examples in poly(n) time. For k = o(log n) this yields the first polynomial time algorithm for learning parity on a superconstant number of variables with sublinear sample complexity. We also give a simple algorithm for learning an unknown size-k parity using O(k log n) examples in n^{k/2} time, which improves on the naive n^k time bound of exhaustive search.
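One standard way to improve on the n^k exhaustive search mentioned at the end of this abstract is a meet-in-the-middle search over half-size subsets. The sketch below illustrates that general idea and is an assumption about how such a bound can be reached, not a transcription of the authors' procedure; the names `parity_vector` and `learn_parity` are hypothetical, and the sample is assumed to be non-empty and noise-free.

```python
from itertools import combinations
from typing import Dict, List, Optional, Sequence, Tuple

Bits = Sequence[int]  # a 0/1 vector


def parity_vector(xs: Sequence[Bits], subset: Tuple[int, ...]) -> Tuple[int, ...]:
    """Parity of the chosen coordinates, evaluated on every example."""
    return tuple(sum(x[j] for j in subset) % 2 for x in xs)


def learn_parity(xs: Sequence[Bits], ys: Sequence[int], k: int) -> Optional[Tuple[int, ...]]:
    """Find k coordinates whose parity matches all labels, or None.

    Tabulates every subset of size k // 2 by its parity vector, then for each
    subset of the remaining size looks up the vector that would complete it
    to the labels. Roughly n^(k/2) subsets are enumerated instead of n^k.
    """
    n = len(xs[0])
    half = k // 2
    table: Dict[Tuple[int, ...], List[Tuple[int, ...]]] = {}
    for s1 in combinations(range(n), half):
        table.setdefault(parity_vector(xs, s1), []).append(s1)
    for s2 in combinations(range(n), k - half):
        # parity(s1) XOR parity(s2) must equal the labels on every example.
        needed = tuple(a ^ b for a, b in zip(parity_vector(xs, s2), ys))
        for s1 in table.get(needed, []):
            if set(s1).isdisjoint(s2):  # overlapping halves would cancel out
                return tuple(sorted(s1 + s2))
    return None


# Toy sample: labels are the parity of coordinates 0, 2 and 5 (an assumed target).
xs = [(1, 0, 0, 1, 0, 1), (0, 1, 1, 0, 0, 0), (1, 1, 1, 0, 1, 1), (0, 0, 0, 0, 1, 1),
      (1, 0, 1, 1, 1, 0), (0, 1, 0, 1, 0, 0), (1, 1, 0, 0, 0, 0), (0, 0, 1, 1, 1, 1)]
ys = [x[0] ^ x[2] ^ x[5] for x in xs]
found = learn_parity(xs, ys, 3)
```

With only a handful of examples several parities may fit the data, so the function returns the first consistent one it finds; the O(k log n) sample bound quoted in the abstract is what makes the consistent size-k parity likely to be the target.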