Abstract:
In this paper I will attempt to survey some of the results and intuitions developed in the area of computational learning theory. My focus will be on two issues in particular: that some examples may be more relevant than others, and that within an example, some features may be more relevant than others. This survey is far from comprehensive, and it strongly reflects my own personal biases as well as issues raised by results presented at this workshop. Issues of relevance are fundamental in the theoretical study of machine learning. In particular, questions about what makes an example "relevant" or "informative" are key motivations for the most popular and most basic theoretical models. Let me begin in the traditional manner by defining the basic models discussed, but from the point of view of their motivations from "relevance."