Results 1-10 of 26
The strength of weak learnability
Machine Learning, 1990
Cited by 861 (24 self)
This paper addresses the problem of improving the accuracy of an hypothesis output by a learning algorithm in the distribution-free (PAC) learning model. A concept class is learnable (or strongly learnable) if, given access to a source of examples of the unknown concept, the learner with high probability is able to output an hypothesis that is correct on all but an arbitrarily small fraction of the instances. The concept class is weakly learnable if the learner can produce an hypothesis that performs only slightly better than random guessing. In this paper, it is shown that these two notions of learnability are equivalent. A method is described for converting a weak learning algorithm into one that achieves arbitrarily high accuracy. This construction may have practical applications as a tool for efficiently converting a mediocre learning algorithm into one that performs extremely well. In addition, the construction has some interesting theoretical consequences, including a set of general upper bounds on the complexity of any strong learning algorithm as a function of the allowed error $\epsilon$.
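The weak-to-strong conversion described in this abstract can be illustrated numerically. The sketch below assumes the paper's three-hypothesis majority-vote step, for which the combined hypothesis has error at most $3\alpha^2 - 2\alpha^3$ when each component has error at most $\alpha < 1/2$ on its own filtered distribution. The function names `majority_error` and `levels_needed` are illustrative and not taken from the paper, and the code only tracks the error bound; it does not train any learner.

```python
def majority_error(alpha: float) -> float:
    """Error bound after one majority-vote step over three hypotheses of error alpha."""
    return 3 * alpha**2 - 2 * alpha**3


def levels_needed(alpha: float, epsilon: float) -> int:
    """Count how many recursive majority-vote levels push the bound to epsilon or below.

    Assumes 0 <= alpha < 1/2 (strictly better than random guessing) and epsilon > 0.
    """
    if not 0 <= alpha < 0.5:
        raise ValueError("alpha must lie in [0, 0.5)")
    if epsilon <= 0:
        raise ValueError("epsilon must be positive")
    levels = 0
    while alpha > epsilon:
        alpha = majority_error(alpha)  # bound shrinks because alpha < 1/2
        levels += 1
    return levels


if __name__ == "__main__":
    print(levels_needed(0.4, 0.01))
```

The fixed point at $\alpha = 1/2$ shows why the weak learner must beat random guessing: at exactly one half the bound never improves.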
Sample compression, learnability, and the Vapnik-Chervonenkis dimension
Machine Learning, 1995
Cited by 83 (5 self)
Within the framework of PAC learning, we explore the learnability of concepts from samples using the paradigm of sample compression schemes. A sample compression scheme of size k for a concept class $C \subseteq 2^X$ consists of a compression function and a reconstruction function. The compression function receives a finite sample set consistent with some concept in C and chooses a subset of k examples as the compression set. The reconstruction function forms a hypothesis on X from a compression set of k examples. For any sample set of a concept in C, the compression set produced by the compression function must lead to a hypothesis consistent with the whole original sample set when it is fed to the reconstruction function. We demonstrate that the existence of a sample compression scheme of fixed size for a class C is sufficient to ensure that the class C is PAC-learnable. Previous work has shown that a class is PAC-learnable if and only if the Vapnik-Chervonenkis (VC) dimension of the class i...
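The compression/reconstruction pair defined above can be sketched for a simple class. The example below assumes the class of closed intervals on the real line, which is not discussed in the abstract and is chosen here only for illustration. It keeps at most two positive examples, so it is a scheme of size at most 2. The names `compress`, `reconstruct` and `is_consistent` are hypothetical.

```python
from typing import Callable, List, Tuple

Example = Tuple[float, bool]  # (point, label); label is True inside the target interval


def compress(sample: List[Example]) -> List[Example]:
    """Keep only the leftmost and rightmost positive examples (at most two)."""
    positives = [x for x, label in sample if label]
    if not positives:
        return []  # an all-negative sample needs no examples
    lo, hi = min(positives), max(positives)
    return [(lo, True)] if lo == hi else [(lo, True), (hi, True)]


def reconstruct(compression_set: List[Example]) -> Callable[[float], bool]:
    """Rebuild a hypothesis: the smallest interval covering the kept points."""
    if not compression_set:
        return lambda x: False  # empty concept
    points = [x for x, _ in compression_set]
    lo, hi = min(points), max(points)
    return lambda x: lo <= x <= hi


def is_consistent(h: Callable[[float], bool], sample: List[Example]) -> bool:
    """Check that the hypothesis labels every example of the sample correctly."""
    return all(h(x) == label for x, label in sample)


if __name__ == "__main__":
    sample = [(0.5, False), (1.0, True), (2.5, True), (1.7, True), (3.2, False)]
    kept = compress(sample)
    print(kept, is_consistent(reconstruct(kept), sample))
```

Consistency holds because every negative example lies outside the target interval, and the reconstructed interval is contained in the target.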
Teaching a Smarter Learner
Journal of Computer and System Sciences, 1994
Cited by 48 (1 self)
We introduce a formal model of teaching in which the teacher is tailored to a particular learner, yet the teaching protocol is designed so that no collusion is possible. Not surprisingly, such a model remedies the nonintuitive aspects of other models in which the teacher must successfully teach any consistent learner. We prove that any class that can be exactly identified by a deterministic polynomial-time algorithm with access to a very rich set of example-based queries is teachable by a computationally unbounded teacher and a polynomial-time learner. In addition, we present other general results relating this model of teaching to various previous results. We also consider the problem of designing teacher/learner pairs in which both the teacher and learner are polynomial-time algorithms and describe teacher/learner pairs for the classes of 1-decision lists and Horn sentences. 1 Introduction Recently, there has been interest in developing formal models of teaching [4, 10, ...
Combinatorial Variability of Vapnik-Chervonenkis Classes with Applications to Sample Compression Schemes
Discrete Applied Mathematics, 1998
Cited by 22 (0 self)
We define embeddings between concept classes that are meant to reflect certain aspects of their combinatorial structure. Furthermore, we introduce a notion of universal concept classes: classes into which any member of a given family of classes can be embedded. These universal classes play a role similar to that played in computational complexity by languages that are hard for a given complexity class. We show that classes of halfspaces in $\mathbb{R}^n$ are universal with respect to families of algebraically defined classes. We present some combinatorial parameters along which the family of classes of a given VC-dimension can be grouped into subfamilies. We use these parameters to investigate the existence of embeddings and the scope of universality of classes. We view the formulation of these parameters and the related questions that they raise as a significant component in this work. A second theme in our work is the notion of Sample Compression Schemes. Intuitively, a class C has a sample compression scheme if for any finite sample, labeled according to a member of C, there exists a short subsample so that the labels of the full sample can be reconstructed from this subsample. By demonstrating the existence of certain compression schemes for the classes of halfspaces, the existence of similar compression schemes for every class embeddable in halfspaces readily follows. We apply this approach to prove existence of compression schemes for all 'geometric concept classes'.
Unlabeled Compression Schemes for Maximum Classes
J. Machine Learning Research, vol, 2007
Cited by 18 (0 self)
We give a compression scheme for any maximum class of VC dimension d that compresses any sample consistent with a concept in the class to at most d unlabeled points from the domain of the sample.
On the Impact of Forgetting on Learning Machines
Journal of the ACM, 1993
Cited by 15 (5 self)
This paper contributes toward the goal of understanding how a computer can be programmed to learn by isolating features of incremental learning algorithms that theoretically enhance their learning potential. In particular, we examine the effects of imposing a limit on the amount of information that a learning algorithm can hold in its memory as it attempts to
This work was facilitated by an international agreement under NSF Grant 9119540.
The Power of Self-Directed Learning
Machine Learning, 1991
Cited by 14 (1 self)
This paper studies self-directed learning, a variant of the on-line learning model in which the learner selects the presentation order for the instances. We give tight bounds on the complexity of self-directed learning for the concept classes of monomials, k-term DNF formulas, and orthogonal rectangles in $\{0, 1, \cdots, n-1\}^d$. These results demonstrate that the number of mistakes under self-directed learning can be surprisingly small. We then prove that the model of self-directed learning is more powerful than all other commonly used on-line and query learning models. Next we explore the relationship between the complexity of self-directed learning and the Vapnik-Chervonenkis dimension. Finally, we explore a relationship between Mitchell's version space algorithm and the existence of self-directed learning algorithms that make few mistakes. Supported in part by a GE Foundation Junior Faculty Grant and NSF Grant CCR-9110108. Part of this research was conduct...
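To make the self-directed model concrete, here is a minimal sketch for the restricted case of monotone monomials over n Boolean variables. It is an illustrative strategy and not necessarily the algorithm analyzed in the paper. The learner presents instances in order of increasing number of ones and predicts negative until its first mistake, which reveals the relevant variables. The name `self_directed_monotone_monomial` and the brute-force enumeration of all $2^n$ instances are assumptions made for the example.

```python
from itertools import product
from typing import Callable, FrozenSet, Optional, Tuple

Instance = Tuple[int, ...]


def self_directed_monotone_monomial(
    n: int, target: Callable[[Instance], bool]
) -> Tuple[int, Optional[FrozenSet[int]]]:
    """Run the learner over all 2^n instances; return (mistakes, learned variable set)."""
    # The learner, not an adversary, chooses the order: fewest ones first.
    order = sorted(product((0, 1), repeat=n), key=sum)
    relevant: Optional[FrozenSet[int]] = None
    mistakes = 0
    for x in order:
        if relevant is None:
            prediction = False  # default guess before any positive is seen
        else:
            prediction = all(x[i] == 1 for i in relevant)
        truth = target(x)
        if prediction != truth:
            mistakes += 1
            if relevant is None:
                # The first positive instance is the minimal one: its ones are
                # exactly the variables of the target monomial.
                relevant = frozenset(i for i, bit in enumerate(x) if bit)
    return mistakes, relevant


if __name__ == "__main__":
    target = lambda x: x[0] == 1 and x[2] == 1  # hypothetical target: x0 AND x2
    print(self_directed_monotone_monomial(4, target))
```

Because every instance with fewer ones than the target's variable count is negative, the only wrong prediction is on the minimal positive instance.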
Shifting: One-Inclusion Mistake Bounds and Sample Compression
EECS Department, University of California, Berkeley, 2007
A geometric approach to sample compression
Cited by 10 (1 self)
The Sample Compression Conjecture of Littlestone & Warmuth has remained unsolved for a quarter century. While maximum classes (concept classes meeting Sauer’s Lemma with equality) can be compressed, the compression of general concept classes reduces to compressing maximal classes (classes that cannot be expanded without increasing VC dimension). Two promising ways forward are: embedding maximal classes into maximum classes with at most a polynomial increase in VC dimension, and compression via operating on geometric representations. This paper presents positive results on the latter approach and a first negative result on the former, through a systematic investigation of finite maximum classes. Simple arrangements of hyperplanes in hyperbolic space are shown to represent maximum classes, generalizing the corresponding Euclidean result. We show that sweeping a generic hyperplane across such arrangements forms an unlabeled compression scheme of size VC dimension and corresponds to a special case of peeling the one-inclusion graph, resolving a recent conjecture of Kuzmin & Warmuth. A bijection between finite maximum classes and certain arrangements of piecewise-linear (PL) hyperplanes in either a ball or Euclidean space is established. Finally we show that d-maximum classes corresponding to PL-hyperplane
A Composition Theorem for Learning Algorithms with Applications to Geometric Concept Classes
In Proceedings of the 29th Annual ACM Symposium on Theory of Computing (STOC), 1997
Cited by 7 (5 self)
This paper solves the open problem of exact learning geometric objects bounded by hyperplanes (and more generally by any constant degree algebraic surfaces) in the constant dimensional space from equivalence queries only (i.e., in the on-line learning model). We present a novel approach that allows, under certain conditions, the composition of learning algorithms for simple classes into an algorithm for a more complicated class. Informally speaking, it shows that if a class of concepts $C$ is learnable in time $t$ using a small space then $C^\star$, the class of all functions of the form $f(g_1, \ldots, g_m)$ with $g_1, \ldots, g_m \in C$ and any $f$, is learnable in polynomial time in $t$ and $m$. We then show that the class of halfspaces in a fixed dimension space is learnable with a small space. 1 Introduction Littlestone's on-line learning model [L88, L89] is one of the major models of learning. Learnability in this model implies learnability in Valiant's PAC model [Val84], and is equivalent to l...