The Power of Vacillation in Language Learning (1992)

by John Case

Results 1 - 10 of 29

Biometric identification

by Sanjay Jain, Arun Sharma - Communications of the ACM, 2000
Cited by 45 (4 self)
Identification of grammars (r.e. indices) for recursively enumerable languages from positive data by algorithmic devices is a well-studied problem in learning theory. The present paper considers identification of r.e. languages by machines that have access to membership oracles for noncomputable sets. It is shown that for any set A there exists another set B such that the collections of r.e. languages that can be identified by machines with access to a membership oracle for B is strictly larger than the collections of r.e. languages that can be identified by machines with access to a membership oracle for A. In other words, there is no maximal inference degree for language identification.

Incremental concept learning for bounded data mining

by Sanjay Jain, Steffen Lange, Thomas Zeugmann - Information and Computation, 1999
Cited by 41 (29 self)
Important refinements of concept learning in the limit from positive data considerably restricting the accessibility of input data are studied. Let c be any concept; every infinite sequence of elements exhausting c is called a positive presentation of c. In all learning models considered the learning machine computes a sequence of hypotheses about the target concept from a positive presentation of it. With iterative learning, the learning machine, in making a conjecture, has access to its previous conjecture and the latest data item coming in. In k-bounded example-memory inference (k is a priori fixed) the learner is allowed to access, in making a conjecture, its previous hypothesis, its memory of up to k data items it has already seen, and the next element coming in. In the case of k-feedback identification, the learning machine, in making a conjecture, has access to its previous conjecture, the latest data item coming in, and, on the basis of this information, it can compute k items and query the database of previous data to find out, for each of the k items, whether or not it is in the database (k is again a priori fixed). In all cases, the sequence of conjectures has to converge to a hypothesis
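The iterative model described in this abstract can be illustrated with a minimal sketch. It assumes a toy setting that is not taken from the paper: the target concepts are finite sets of integers, and a hypothesis is simply the finite set itself, standing in for a grammar. The names `iterative_update` and `conjectures_on` are hypothetical. The point is only that each new conjecture depends on the previous conjecture and the latest data item, with no access to earlier data.

```python
from typing import FrozenSet, Iterable, List

# Toy stand-in for a grammar: the finite set the hypothesis describes (assumption).
Hypothesis = FrozenSet[int]


def iterative_update(previous: Hypothesis, datum: int) -> Hypothesis:
    """Next conjecture, computed from the previous conjecture and the newest datum only."""
    return previous | {datum}


def conjectures_on(prefix: Iterable[int]) -> List[Hypothesis]:
    """Run the learner on a finite prefix of a positive presentation."""
    hypothesis: Hypothesis = frozenset()
    history: List[Hypothesis] = []
    for datum in prefix:
        hypothesis = iterative_update(hypothesis, datum)
        history.append(hypothesis)
    return history


# A prefix of a positive presentation of the concept {1, 3}.
history = conjectures_on([3, 1, 3, 1, 1])
```

On any presentation of a finite set, the conjectures of this toy learner stop changing once every element has appeared. The k-bounded example-memory and k-feedback models from the abstract would additionally pass a small memory or a query interface to the update function; they are not sketched here.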

The intrinsic complexity of language identification

by Sanjay Jain, Arun Sharma - Journal of Computer and System Sciences, 1996
Cited by 17 (7 self)
A new investigation of the complexity of language identification is undertaken using the notion of reduction from recursion theory and complexity theory. The approach, referred to as the intrinsic complexity of language identification, employs notions of ‘weak’ and ‘strong’ reduction between learnable classes of languages. The intrinsic complexity of several classes is considered and the results agree with the intuitive difficulty of learning these classes. Several complete classes are shown for both the reductions and it is also established that the weak and strong reductions are distinct. An interesting result is that the self-referential class of Wiehagen in which the minimal element of every language is a grammar for the language and the class of pattern languages introduced by Angluin are equivalent in the strong sense. This study has been influenced by a similar treatment of function identification by Freivalds, Kinber, and Smith.

Infinitary Self Reference in Learning Theory

by John Case, 1994
Cited by 17 (6 self)
Kleene's Second Recursion Theorem provides a means for transforming any program p into a program e(p) which first creates a quiescent self copy and then runs p on that self copy together with any externally given input. e(p), in effect, has complete (low level) self knowledge, and p represents how e(p) uses its self knowledge (and its knowledge of the external world). Infinite regress is not required since e(p) creates its self copy outside of itself. One mechanism to achieve this creation is a self replication trick isomorphic to that employed by single-celled organisms. Another is for e(p) to look in a mirror to see which program it is. In 1974 the author published an infinitary generalization of Kleene's theorem which he called the Operator Recursion Theorem. It provides a means for obtaining an (algorithmically) growing collection of programs which, in effect, share a common (also growing) mirror from which they can obtain complete low level models of themselves and the other prog...
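The self-replication trick mentioned in this abstract can be illustrated with a small quine-style sketch. This is not the construction from the paper and not a proof of the recursion theorem; it only shows how a program can build a complete copy of its own text without infinite regress, by keeping a quoted template of itself as data. The variable names `template` and `source` are illustrative.

```python
# Data part: a template of the program with a placeholder for its own quoted text.
template = 'template = {!r}\nsource = template.format(template)'
# Action part: fill the placeholder with the quoted template, yielding the program text.
source = template.format(template)
```

After these two statements run, `source` holds the text of exactly those two statements (the comments are not reproduced). Executing `source` as a program would again produce the same string, which is the sense in which the program carries a low-level model of itself.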

Synthesizing Enumeration Techniques For Language Learning

by Ganesh R. Baliga, John Case, Sanjay Jain - In Proceedings of the Ninth Annual Conference on Computational Learning Theory, 1996
Cited by 16 (7 self)
this paper we assume, without loss of generality, that for all $\sigma \subseteq \tau$, $[M(\sigma) \neq\ ?] \Rightarrow [M(\tau) \neq\ ?]$.

The synthesis of language learners

by Ganesh R. Baliga, Sanjay Jain - Information and Computation, 1999
Cited by 12 (0 self)
An index for an r.e. class of languages (by definition) is a procedure which generates a sequence of grammars defining the class. An index for an indexed family of languages (by definition) is a procedure which generates a sequence of decision procedures defining the family. Studied is the metaproblem of synthesizing from indices for r.e. classes and for indexed families of languages various kinds of language-learners for the corresponding classes or families indexed. Many positive results, as well as some negative results, are presented regarding the existence of such synthesizers. The negative results essentially provide lower bounds for the positive results. The proofs of some of the positive results yield, as pleasant corollaries, subset-principle or tell-tale style characterizations for the learnability of the corresponding classes or families indexed. For example, the indexed families of recursive languages that can be behaviorally correctly identified from positive data are surprisingly characterized by Angluin’s (1980b) Condition 2 (the subset principle for circumventing overgeneralization).

Complexity issues for vacillatory function identification

by Sanjay Jain, Arun Sharma - Information and Computation, 1995
Cited by 12 (9 self)
It was previously shown by Barzdin and Podnieks that one does not increase the power of learning programs for functions by allowing learning algorithms to converge to a finite set of correct programs instead of requiring them to converge to a single correct program. In this paper we define some new, subtle, but natural concepts of mind change complexity for function learning and show that, if one bounds this complexity for learning algorithms, then, by contrast with Barzdin and Podnieks' result, there are interesting and sometimes complicated tradeoffs between these complexity bounds, bounds on the number of final correct programs, and learning power. CR Classification Number: I.2.6 (Learning – Induction).

Learning in the presence of inaccurate information

by Mark Fulk, Sanjay Jain - In Proceedings of the 2nd Annual ACM Conference on Computational Learning Theory, 1989
Cited by 9 (3 self)
The present paper considers the effects of introducing inaccuracies in a learner’s environment in Gold’s learning model of identification in the limit. Three kinds of inaccuracies are considered: presence of spurious data is modeled as learning from a noisy environment, missing data is modeled as learning from an incomplete environment, and the presence of a mixture of both spurious and missing data is modeled as learning from an imperfect environment. Two learning domains are considered, namely, identification of programs from graphs of computable functions and identification of grammars from positive data about recursively enumerable languages. Many hierarchies and tradeoffs resulting from the interplay between the number of errors allowed in the final hypotheses, the number of inaccuracies in the data, the types of inaccuracies, and the type of success criteria are derived. An interesting result is that in the context of function learning, incomplete data is strictly worse for learning than noisy data.

Synthesizing noise-tolerant language learners

by Sanjay Jain, Arun Sharma - Theoretical Computer Science A, 1997
Cited by 7 (3 self)
An index for an r.e. class of languages (by definition) generates a sequence of grammars defining the class. An index for an indexed family of languages (by definition) generates a sequence of decision procedures defining the family. F. Stephan’s model of noisy data is employed, in which, roughly, correct data crops up infinitely often, and incorrect data only finitely often. Studied, then, is the synthesis from indices for r.e. classes and for indexed families of languages of various kinds of noise-tolerant language-learners for the corresponding classes or families indexed. Many positive results, as well as some negative results, are presented regarding the existence of such synthesizers. The proofs of most of the positive results yield, as pleasant corollaries, strict subset-principle or tell-tale style characterizations for the noise-tolerant learnability of the corresponding classes or families indexed.

On Aggregating Teams of Learning Machines

by Sanjay Jain, Arun Sharma - Theoretical Computer Science A, 1994
Cited by 7 (4 self)
The present paper studies the problem of when a team of learning machines can be aggregated into a single learning machine without any loss in learning power. The main results concern aggregation ratios for vacillatory identification of languages from texts. For a positive integer n, a machine is said to $\mathrm{TxtFex}_n$-identify a language L just in case the machine converges to up to n grammars for L on any text for L. For such identification criteria, the aggregation ratio is derived for the n = 2 case. It is shown that the collections of languages that can be $\mathrm{TxtFex}_2$-identified by teams with success ratio greater than 5/6 are the same as those collections of languages that can be $\mathrm{TxtFex}_2$-identified by a single machine. It is also established that 5/6 is indeed the cut-off point by showing that there are collections of languages that can be $\mathrm{TxtFex}_2$-identified by a team employing 6 machines, at least 5 of which are required to be successful, but cannot be $\mathrm{TxtFex}_2$-identified by any single machine. Additionally, aggregation ratios are also derived for finite identification of languages from positive data and for numerous criteria involving language learning from both positive and negative data.
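The vacillatory criterion and the team success condition in this abstract can be made concrete with a rough sketch under strong simplifying assumptions. Real convergence is a property of an infinite text and correctness of a grammar is not decidable in general, so the code below only checks a finite prefix, takes the stabilization point as given, and models grammars as a lookup table from hypothetical grammar names to the finite languages they generate. The function names `txtfex_on_prefix` and `team_succeeds` are illustrative and do not come from the paper.

```python
from typing import Dict, FrozenSet, Sequence


def txtfex_on_prefix(
    conjectures: Sequence[str],
    grammars: Dict[str, FrozenSet[int]],
    target: FrozenSet[int],
    n: int,
    stable_from: int,
) -> bool:
    """Finite-prefix check: from index stable_from on, the learner uses at most n
    distinct grammars, and every one of them generates the target language."""
    tail = conjectures[stable_from:]
    if not tail:
        return False
    final = set(tail)
    return len(final) <= n and all(grammars.get(g) == target for g in final)


def team_succeeds(outcomes: Sequence[bool], required: int) -> bool:
    """A team succeeds when at least `required` of its members succeed."""
    return sum(outcomes) >= required


# Toy data (assumed): g1 and g2 are two different grammars for the language {0, 2}.
grammars = {"g1": frozenset({0, 2}), "g2": frozenset({0, 2}), "g3": frozenset({0})}
conjectures = ["g3", "g1", "g2", "g1", "g2"]
ok_with_two = txtfex_on_prefix(conjectures, grammars, frozenset({0, 2}), 2, 1)
ok_with_one = txtfex_on_prefix(conjectures, grammars, frozenset({0, 2}), 1, 1)
```

In this toy run the learner keeps alternating between two correct grammars, so the prefix check passes for n = 2 and fails for n = 1. A team of 6 machines with at least 5 required to succeed, as in the abstract's cut-off example, corresponds to `team_succeeds(outcomes, 5)` on a list of 6 outcomes.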