## The Power of Self-Directed Learning (1991)

Venue: Machine Learning

Citations: 12 (1 self)

### BibTeX

```bibtex
@INPROCEEDINGS{Goldman91thepower,
  author    = {Sally A. Goldman and Robert H. Sloan},
  title     = {The Power of Self-Directed Learning},
  booktitle = {Machine Learning},
  year      = {1991},
  pages     = {271--294}
}
```


### Abstract

This paper studies self-directed learning, a variant of the on-line learning model in which the learner selects the presentation order for the instances. We give tight bounds on the complexity of self-directed learning for the concept classes of monomials, k-term DNF formulas, and orthogonal rectangles in {0, 1, ..., n-1}^d. These results demonstrate that the number of mistakes under self-directed learning can be surprisingly small. We then prove that the model of self-directed learning is more powerful than all other commonly used on-line and query learning models. Next we explore the relationship between the complexity of self-directed learning and the Vapnik-Chervonenkis dimension. Finally, we explore a relationship between Mitchell's version space algorithm and the existence of self-directed learning algorithms that make few mistakes. Supported in part by a GE Foundation Junior Faculty Grant and NSF Grant CCR-9110108. Part of this research was conduct...

### Citations

1695 | A Theory of the Learnable
- Valiant
- 1984
Citation Context: ...] have shown that this combinatorial measure of a concept class characterizes the number of examples required for learning any concept in the class under the distribution-free or PAC model of Valiant [17]. Related to the VC-dimension are the notions of maximal and maximum concept classes [4, 19]. A concept class is maximal if adding any concept to the class increases the VC-dimension of the class. Def...

946 | On the uniform convergence of relative frequencies of events to their probabilities. Theory Probab
- Vapnik, Chervonenkis
- 1971
Citation Context: ...a concept class C (sdc(C)) as follows: sdc(C) = max_{c ∈ C} {opt(c)}, where opt(c) is the optimal mistake bound for learning c under self-directed learning. We now define the Vapnik-Chervonenkis dimension [18]. Let X be any instance space, and C be a concept class over X. A finite set Y ⊆ X is shattered by C if {c ∩ Y | c ∈ C} = 2^Y. In other words, Y ⊆ X is shattered by C if for each subset Y′ ⊆ Y, there...
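The shattering condition quoted above can be checked mechanically for small finite classes. The following brute-force sketch (function names are illustrative, not from the paper) tests whether a set is shattered and finds the VC-dimension by exhaustive search:

```python
from itertools import chain, combinations

def is_shattered(Y, concepts):
    """Y is shattered by the class if every subset of Y equals c ∩ Y
    for some concept c in the class."""
    Y = frozenset(Y)
    traces = {frozenset(c) & Y for c in concepts}
    subsets = chain.from_iterable(combinations(Y, r) for r in range(len(Y) + 1))
    return all(frozenset(s) in traces for s in subsets)

def vc_dimension(X, concepts):
    """Largest d such that some d-subset of X is shattered (brute force)."""
    for d in range(len(X), -1, -1):
        if any(is_shattered(Y, concepts) for Y in combinations(X, d)):
            return d
    return 0

# Initial segments {0..t-1} over X = {0,1,2,3}: no pair {a,b} with a < b can
# realize the trace {b} without a, so only singletons are shattered.
X = [0, 1, 2, 3]
thresholds = [set(range(t)) for t in range(5)]
print(vc_dimension(X, thresholds))  # 1
```

The exhaustive search is exponential in |X|, which is fine for illustration but not for anything larger.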

647 | Queries and concept learning
- Angluin
- 1988
Citation Context: ...ry finite subset Y ⊆ X, the class C, when restricted to be a class over Y, contains Φ_d(|Y|) concepts. Finally, we describe membership and equivalence queries as originally defined by Angluin [1]. • A membership query is a call to an oracle that on input x, for any x ∈ X, classifies x as either a positive or negative instance according to the target concept c ∈ C. • An equivalence query is ...
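The two oracle types can be mimicked over an explicit finite instance space. In this sketch the set-based concept encoding and function names are my own, chosen for illustration:

```python
def membership_query(x, target):
    """Oracle: classify x according to the (hidden) target concept."""
    return x in target

def equivalence_query(hypothesis, target, X):
    """Oracle: return None if the hypothesis agrees with the target on all
    of X, otherwise return some instance on which they disagree."""
    for x in X:
        if (x in hypothesis) != (x in target):
            return x
    return None

X = [0, 1, 2, 3]
target = {1, 2}
print(membership_query(2, target))              # True
print(equivalence_query({1}, target, X))        # 2 (a counterexample)
print(equivalence_query({1, 2}, target, X))     # None (hypothesis correct)
```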

624 | Learnability and the Vapnik-Chervonenkis dimension
- Blumer, Ehrenfeucht, et al.
- 1989
Citation Context: ..., but none of the instances in Y − Y′. The Vapnik-Chervonenkis dimension of C, denoted vcd(C), is defined to be the smallest d for which no set of d + 1 points is shattered by C. Blumer et al. [2] have shown that this combinatorial measure of a concept class characterizes the number of examples required for learning any concept in the class under the distribution-free or PAC model of Valiant [...

237 | On the density of families of sets
- Sauer
- 1972
Citation Context: ...f the class. Define Φ_d(m) = Σ_{i=0}^{d} C(m, i) for m ≥ d, and Φ_d(m) = 2^m for m < d. If C is a concept class of VC-dimension d on a finite set X with |X| = m, then the cardinality of C is at most Φ_d(m) [16, 18]. A concept class C over X is maximum if for every finite subset Y ⊆ X, the class C, when restricted to be a class over Y, contains Φ_d(|Y|) concepts. Finally, we describe membership and equiv...
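The bound Φ_d(m) is directly computable. A small sketch, with the threshold-class check being my own illustration of a class that meets the bound exactly:

```python
from math import comb

def phi(d, m):
    """Sauer's function: Phi_d(m) = sum_{i=0}^{d} C(m, i) for m >= d,
    and 2^m for m < d."""
    if m < d:
        return 2 ** m
    return sum(comb(m, i) for i in range(d + 1))

# Thresholds (initial segments) on m points have VC-dimension 1 and exactly
# m + 1 distinct concepts, matching Phi_1(m) = 1 + m: a "maximum" class.
m = 10
print(phi(1, m))   # 11
print(phi(2, m))   # 1 + 10 + 45 = 56
```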

223 | Quantifying inductive bias: AI learning algorithms and Valiant's learning framework
- Haussler
- 1988
Citation Context: ...is as follows. Initially let G contain only the concept containing all instances and let S contain only the empty concept. Then, for each example, both G and S are appropriately updated. See Haussler [6, 7], and also Rivest and Sloan [15] for a discussion of connections between Mitchell's version space algorithm and the distribution-free learning model. We now describe a relation between the version spa...
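The G/S update itself is only alluded to in this excerpt. A minimal sketch that tracks the whole version space explicitly for a finite class, recovering G and S as its maximal and minimal elements (the extensional encoding of monotone monomials is an assumption for illustration, not the paper's representation):

```python
def update_version_space(concepts, x, label):
    """Keep only the concepts consistent with the labeled example (x, label)."""
    return [c for c in concepts if (x in c) == label]

def boundaries(version_space):
    """G: concepts with no strict superset in the space; S: no strict subset."""
    sets = [frozenset(c) for c in version_space]
    G = [c for c in sets if not any(c < d for d in sets)]
    S = [c for c in sets if not any(d < c for d in sets)]
    return G, S

# Monotone monomials over two boolean variables, viewed extensionally over
# the instance space {"00", "01", "10", "11"}.
X = ["00", "01", "10", "11"]
concepts = [
    set(X),          # empty monomial: always true
    {"10", "11"},    # x1
    {"01", "11"},    # x2
    {"11"},          # x1 x2
]
vs = update_version_space(concepts, "11", True)   # all four survive
vs = update_version_space(vs, "10", False)        # rules out x1 and "always true"
G, S = boundaries(vs)
print(sorted(len(c) for c in vs))  # [1, 2]
```

Keeping the entire version space is exponential in general; maintaining only the G and S boundary sets, as Mitchell's algorithm does, is the point of the candidate-elimination approach.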

129 | Version spaces: A candidate elimination approach to rule learning
- Mitchell
- 1977
Citation Context: ...amily of concept classes for which the complexity of self-directed learning is larger than the VC-dimension. Finally, in Section 8 we explore a relationship between Mitchell's version space algorithm [14] and the existence of self-directed learning algorithms that make few mistakes. 2 Motivation: In this section we motivate the self-directed learning model by reviewing the allergist example given by Go...

107 | Learning when irrelevant attributes abound: a new linear-threshold algorithm
- Littlestone
- 1988
Citation Context: ...ally, a hypothesis h for C is a rule that given any x ∈ X outputs in polynomial time a prediction for c(x). We are now ready to define the absolute mistake-bound variant of the on-line learning model [8, 11]. An on-line algorithm (or prediction algorithm) for C is an algorithm that runs under the following scenario. A learning session consists of a set of trials. In each trial, an adversary presents th...
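The trial loop of the mistake-bound model can be rendered concretely. This toy session, a halving-style learner over a class of singletons, is my own illustration rather than anything from the paper; it counts mistakes against an adversary-chosen instance order:

```python
def run_session(instances, target, predict_fn, update_fn, state):
    """One on-line session: in each trial the learner predicts the label of
    the presented instance, then receives the true label."""
    mistakes = 0
    for x in instances:
        guess = predict_fn(state, x)
        truth = x in target
        if guess != truth:
            mistakes += 1
        state = update_fn(state, x, truth)
    return mistakes

# Halving-style learner over an explicit finite concept class: predict with
# the majority vote of the still-consistent concepts, then discard the losers.
def predict(consistent, x):
    votes = sum(1 for c in consistent if x in c)
    return votes * 2 > len(consistent)

def update(consistent, x, truth):
    return [c for c in consistent if (x in c) == truth]

X = list(range(8))
singletons = [{i} for i in X]          # concept class: all singletons over X
m = run_session(X, {5}, predict, update, singletons)
print(m)  # 1
```

With singletons, the majority always predicts negative, so the learner's single mistake comes exactly when the adversary presents the target's one positive instance.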

77 | Learning conjunctive concepts in structural domains
- Haussler
- 1989
Citation Context: ...is as follows. Initially let G contain only the concept containing all instances and let S contain only the empty concept. Then, for each example, both G and S are appropriately updated. See Haussler [6, 7], and also Rivest and Sloan [15] for a discussion of connections between Mitchell's version space algorithm and the distribution-free learning model. We now describe a relation between the version spa...

63 | Learning integer lattices
- Helmbold, Sloan, et al.
- 1992
Citation Context: ... concept class of multiples be defined as follows. Let the instance space be the natural numbers, and let the concept class be all multiples of i for each i ∈ ℕ. This class has infinite VC-dimension [9]. Nevertheless, we now show that sdc(multiples) = 1. The algorithm for obtaining this mistake bound is shown in Figure 7. Clearly this procedure finds the target concept while making only a single m...
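Figure 7 is not reproduced in this excerpt. A plausible reconstruction of a one-mistake self-directed learner for the multiples class (my sketch, not the paper's verbatim procedure): present the naturals in increasing order, always predicting negative; the first mistake occurs at n = i, which identifies the target.

```python
def learn_multiples(target_i, limit=10_000):
    """Self-directed learning of 'multiples of i': present 1, 2, 3, ...
    predicting 'not in the concept' each time. The first (and only) mistake
    happens at n = target_i, the smallest positive multiple, which pins down
    the target concept."""
    mistakes = 0
    for n in range(1, limit + 1):
        prediction = False                # learner predicts: n is negative
        actual = (n % target_i == 0)      # oracle labels n by the hidden target
        if prediction != actual:
            mistakes += 1
            return n, mistakes            # first positive instance is i itself
    raise RuntimeError("limit exceeded")

print(learn_multiples(7))   # (7, 1)
```

This illustrates the quoted point: a class of infinite VC-dimension can still have self-directed complexity 1, because the learner controls which instance is queried next.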

50 | Predicting {0, 1} functions on randomly drawn points
- Haussler, Littlestone, et al.
- 1994
Citation Context: ...ally, a hypothesis h for C is a rule that given any x ∈ X outputs in polynomial time a prediction for c(x). We are now ready to define the absolute mistake-bound variant of the on-line learning model [8, 11]. An on-line algorithm (or prediction algorithm) for C is an algorithm that runs under the following scenario. A learning session consists of a set of trials. In each trial, an adversary presents th...

40 | Learning nested differences of intersection-closed concept classes
- Helmbold, Sloan, et al.
- 1990
Citation Context: ...y that c is the unique most specific concept consistent with the instances in I. (This generalizes the notion of a spanning set of an intersection-closed class given by Helmbold, Sloan, and Warmuth [10].) Finally we define I(C) for concept class C as follows: I(C) = max_{c ∈ C} {|I_c| : I_c is a minimal spanning set for c with respect to C}. To provide some intuition for our more general result, we fir...

34 | Learning binary relations and total orders
- Goldman, Rivest, et al.
- 1989
Citation Context: .... In the next section, we motivate the self-directed learning model. Then in Section 3 we formally describe the model of self-directed learning (as originally defined by Goldman, Rivest, and Schapire [5]). Next in Section 4, we discuss some related work. In Section 5 we give tight bounds on the complexity of self-directed learning for the concept classes of monomials, k-term DNF formulas, and orthogo...

33 | Learning complicated concepts reliably and usefully
- Rivest, Sloan
- 1988
Citation Context: ...tain only the concept containing all instances and let S contain only the empty concept. Then, for each example, both G and S are appropriately updated. See Haussler [6, 7], and also Rivest and Sloan [15] for a discussion of connections between Mitchell's version space algorithm and the distribution-free learning model. We now describe a relation between the version space algorithm and situations in w...

30 | On the complexity of learning from counterexamples
- Turan
- 1989
Citation Context: ...learning is more powerful than all other commonly used on-line and query learning models. In particular, we show that this model is more powerful than all of the models considered by Maass and Turán [12, 13]. Next in Section 7 we study the relationship between the optimal mistake bound under self-directed learning and the Vapnik-Chervonenkis dimension. We first show that the VC-dimension can be arbitraril...

25 | An analytical comparison of some rule-learning programs
- Bundy, Silver, et al.
- 1985
Citation Context: ...[Figure 11: The rule space (labels: most general rule, most specific rule)] with any sample. For example, the class of monotone monomials has the property that the set S never contains more than one hypothesis [3]. Furthermore, for any monomial c, any minimal spanning set I_c is just the single instance for which all the variables in c are 1 and the rest are 0. Thus for the class of monotone monomials, I...

18 | Space-bounded learning and the Vapnik-Chervonenkis dimension
- Floyd
- 1989
Citation Context: ...of examples required for learning any concept in the class under the distribution-free or PAC model of Valiant [17]. Related to the VC-dimension are the notions of maximal and maximum concept classes [4, 19]. A concept class is maximal if adding any concept to the class increases the VC-dimension of the class. Define Φ_d(m) = Σ_{i=0}^{d} C(m, i) for m ≥ d, and Φ_d(m) = 2^m for m < d. If C is a concept class of VC-...

17 | On the complexity of learning from counterexamples and membership queries
- Maass, Turán
- 1990

3 | Complete range spaces. Unpublished manuscript
- Welzl
- 1987
Citation Context: ...of examples required for learning any concept in the class under the distribution-free or PAC model of Valiant [17]. Related to the VC-dimension are the notions of maximal and maximum concept classes [4, 19]. A concept class is maximal if adding any concept to the class increases the VC-dimension of the class. Define Φ_d(m) = Σ_{i=0}^{d} C(m, i) for m ≥ d, and Φ_d(m) = 2^m for m < d. If C is a concept class of VC-...