Abstract:
Within the framework of PAC learning, we explore the learnability of concepts from samples using the paradigm of sample compression schemes. A sample compression scheme of size k for a concept class C ⊆ 2^X consists of a compression function and a reconstruction function. The compression function, given a finite sample set consistent with some concept in C, chooses a subset of k examples as the compression set. The reconstruction function, given a compression set of k examples, forms a hypothesis on X. For any sample set of a concept in C, the hypothesis reconstructed from the compression set that the compression function produces must be consistent with the whole original sample set. We demonstrate that the existence of a fixed-size sample compression scheme for a class C is sufficient to ensure that C is learnable. We define maximum and maximal classes of VC dimension d. For every maximum class of VC dimension d, there is a sample compression scheme...
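As an illustration of the compression and reconstruction functions described above, here is a minimal sketch for one simple concept class: closed intervals on the real line. The class, the function names (`compress`, `reconstruct`, `is_consistent`), and the choice to keep at most two examples (fewer when the sample has fewer than two distinct positive points) are assumptions made for this example, not taken from the paper.

```python
from typing import Callable, List, Tuple

# A labeled example: (point on the real line, label). Hypothetical type alias.
Example = Tuple[float, bool]


def compress(sample: List[Example]) -> List[Example]:
    """Keep the leftmost and rightmost positive examples (at most 2)."""
    positives = sorted(x for x, label in sample if label)
    if not positives:
        return []  # no positive examples: the empty set is enough
    if positives[0] == positives[-1]:
        return [(positives[0], True)]
    return [(positives[0], True), (positives[-1], True)]


def reconstruct(compression_set: List[Example]) -> Callable[[float], bool]:
    """Build the smallest closed interval containing the kept points."""
    points = [x for x, _ in compression_set]
    if not points:
        return lambda x: False  # empty hypothesis labels everything negative
    lo, hi = min(points), max(points)
    return lambda x: lo <= x <= hi


def is_consistent(h: Callable[[float], bool], sample: List[Example]) -> bool:
    """Check that the hypothesis agrees with every labeled example."""
    return all(h(x) == label for x, label in sample)


# A sample labeled by some interval (here, any interval containing
# 1.0, 2.5 and 3.0 but not 0.5 or 4.2).
sample = [(0.5, False), (1.0, True), (2.5, True), (3.0, True), (4.2, False)]
kept = compress(sample)
hypothesis = reconstruct(kept)
```

Because the sample is consistent with some interval, every negative example lies outside the span of the positive examples, so the reconstructed interval agrees with the whole original sample even though only its two extreme positive points were kept.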