Results 1 
5 of
5
The Power of Vacillation in Language Learning
, 1992
"... Some extensions are considered of Gold's influential model of language learning by machine from positive data. Studied are criteria of successful learning featuring convergence in the limit to vacillation between several alternative correct grammars. The main theorem of this paper is that there are ..."
Abstract

Cited by 44 (11 self)
 Add to MetaCart
Some extensions are considered of Gold's influential model of language learning by machine from positive data. Studied are criteria of successful learning featuring convergence in the limit to vacillation between several alternative correct grammars. The main theorem of this paper is that there are classes of languages that can be learned if convergence in the limit to up to (n+1) exactly correct grammars is allowed but which cannot be learned if convergence in the limit is to no more than n grammars, where the no more than n grammars can each make finitely many mistakes. This contrasts sharply with results of Barzdin and Podnieks and, later, Case and Smith, for learnability from both positive and negative data. A subset principle from a 1980 paper of Angluin is extended to the vacillatory and other criteria of this paper. This principle, provides a necessary condition for circumventing overgeneralization in learning from positive data. It is applied to prove another theorem to the eff...
Machine induction without revolutionary changes in hypothesis size
 Information and Computation
, 1996
"... This paper provides a beginning study of the effects on inductive inference of paradigm shifts whose absence is approximately modeled by various formal approaches to forbidding large changes in the size of programs conjectured. One approach, called severely parsimonious, requires all the programs co ..."
Abstract

Cited by 3 (2 self)
 Add to MetaCart
This paper provides a beginning study of the effects on inductive inference of paradigm shifts whose absence is approximately modeled by various formal approaches to forbidding large changes in the size of programs conjectured. One approach, called severely parsimonious, requires all the programs conjectured on the way to success to be nearly (i.e., within a recursive function of) minimal size. It is shown that this very conservative constraint allows learning infinite classes of functions, but not infinite r.e. classes of functions. Another approach, called nonrevolutionary, requires all conjectures to be nearly the same size as one another. This quite conservative constraint is, nonetheless, shown to permit learning some infinite r.e. classes of functions. Allowing up to one extra bounded size mind change towards a final program learned certainly doesnâ€™t appear revolutionary. However, somewhat surprisingly for scientific (inductive) inference, it is shown that there are classes learnable with the nonrevolutionary constraint (respectively, with severe parsimony), up to (i + 1) mind changes, and no anomalies, which classes cannot be learned with no size constraint, an unbounded, finite number of anomalies in the final program, but with no more than i mind changes. Hence, in some cases, the possibility of one extra mind change is considerably more liberating than removal of very conservative size shift constraints. The proofs of these results are also combinatorially interesting. 1
Parsimony Hierarchies for Inductive Inference
 Journal of Symbolic Logic
"... Freivalds defined an acceptable programming system independent criterion for learning programs for functions in which the final programs were required to be both correct and "nearly" minimal size, i.e, within a computable function of being purely minimal size. Kinber showed that this parsimony requi ..."
Abstract

Cited by 2 (1 self)
 Add to MetaCart
Freivalds defined an acceptable programming system independent criterion for learning programs for functions in which the final programs were required to be both correct and "nearly" minimal size, i.e, within a computable function of being purely minimal size. Kinber showed that this parsimony requirement on final programs limits learning power. However, in scientific inference, parsimony is considered highly desirable. A limcomputable function is (by definition) one calculable by a total procedure allowed to change its mind finitely many times about its output. Investigated is the possibility of assuaging somewhat the limitation on learning power resulting from requiring parsimonious final programs by use of criteria which require the final, correct programs to be "notsonearly" minimal size, e.g., to be within a limcomputable function of actual minimal size. It is shown that some parsimony in the final program is thereby retained, yet learning power strictly increases. Considered, then, are limcomputable functions as above but for which notations for constructive ordinals are used to bound the number of mind changes allowed regarding the output. This is a variant of an idea introduced by Freivalds and Smith. For this ordinal notation complexity bounded version of limcomputability, the power of the resultant learning criteria form finely graded, infinitely ramifying, infinite hierarchies intermediate between the computable and the limcomputable cases. Some of these hierarchies, for the natural notations determining them, are shown to be optimally tight.
Learning Languages and Functions by Erasing
 Theoretical Computer Science
, 2000
"... Learning by erasing means the process of eliminating potential hypotheses from further consideration thereby converging to the least hypothesis never eliminated. This hypothesis must be a solution to the actual learning problem. The capabilities of learning by erasing are investigated in relation to ..."
Abstract

Cited by 1 (1 self)
 Add to MetaCart
Learning by erasing means the process of eliminating potential hypotheses from further consideration thereby converging to the least hypothesis never eliminated. This hypothesis must be a solution to the actual learning problem. The capabilities of learning by erasing are investigated in relation to two factors: the choice of the overall hypothesis space itself and what sets of hypotheses must or may be erased. These learning capabilities are studied for two fundamental kinds of objects to be learned, namely languages and functions. For learning languages by erasing, the case of learning indexed families is investigated. A complete picture of all separations and coincidences of the considered models is derived. Learning by erasing is compared with standard models of language learning such as learning in the limit, finite learning and conservative learning. The exact location of these types within the hierarchy of the models of learning by erasing is established. Necessary and sufficient conditions for language learning by erasing are presented. For learning functions by erasing, mainly the case of learning minimal programs is studied. Various relationships and differences between the considered types of function learning by erasing and also to standard function learning are exhibited. In particular, these types are explored in Kolmogorov numberings that can be viewed as natural Godel numberings of the partial recursive functions. Necessary and sufficient conditions for function learning by erasing are derived.