Results 1-10 of 19
The Power of Vacillation in Language Learning
, 1992
Cited by 48 (11 self)
Some extensions are considered of Gold's influential model of language learning by machine from positive data. Studied are criteria of successful learning featuring convergence in the limit to vacillation between several alternative correct grammars. The main theorem of this paper is that there are classes of languages that can be learned if convergence in the limit to up to (n+1) exactly correct grammars is allowed but which cannot be learned if convergence in the limit is to no more than n grammars, where the no more than n grammars can each make finitely many mistakes. This contrasts sharply with results of Barzdin and Podnieks and, later, Case and Smith, for learnability from both positive and negative data. A subset principle from a 1980 paper of Angluin is extended to the vacillatory and other criteria of this paper. This principle provides a necessary condition for circumventing overgeneralization in learning from positive data. It is applied to prove another theorem to the eff...
Incremental concept learning for bounded data mining
 Information and Computation
, 1999
Cited by 40 (30 self)
Important refinements of concept learning in the limit from positive data considerably restricting the accessibility of input data are studied. Let c be any concept; every infinite sequence of elements exhausting c is called a positive presentation of c. In all learning models considered the learning machine computes a sequence of hypotheses about the target concept from a positive presentation of it. With iterative learning, the learning machine, in making a conjecture, has access to its previous conjecture and the latest data item coming in. In k-bounded example-memory inference (k is a priori fixed) the learner is allowed to access, in making a conjecture, its previous hypothesis, its memory of up to k data items it has already seen, and the next element coming in. In the case of k-feedback identification, the learning machine, in making a conjecture, has access to its previous conjecture, the latest data item coming in, and, on the basis of this information, it can compute k items and query the database of previous data to find out, for each of the k items, whether or not it is in the database (k is again a priori fixed). In all cases, the sequence of conjectures has to converge to a hypothesis
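The iterative model described in this abstract can be illustrated with a minimal Python sketch. It is not taken from the paper: the target class (all finite sets of integers), the use of `None` as a pause symbol, and the names `iterative_update` and `run_learner` are assumptions chosen for illustration. The point it shows is only that each new conjecture is computed from the previous conjecture and the latest data item, with no access to earlier data.

```python
from typing import FrozenSet, Iterable, List, Optional

# Assumption: a hypothesis is represented directly by the finite set it denotes.
Hypothesis = FrozenSet[int]


def iterative_update(prev: Hypothesis, datum: Optional[int]) -> Hypothesis:
    """One iterative step: sees only the previous conjecture and the new datum."""
    if datum is None:  # hypothetical pause symbol carrying no information
        return prev
    return prev | {datum}


def run_learner(presentation: Iterable[Optional[int]]) -> List[Hypothesis]:
    """Feed a finite prefix of a positive presentation; return all conjectures."""
    hyp: Hypothesis = frozenset()
    conjectures: List[Hypothesis] = []
    for datum in presentation:
        hyp = iterative_update(hyp, datum)
        conjectures.append(hyp)
    return conjectures


if __name__ == "__main__":
    for step, conjecture in enumerate(run_learner([3, 1, 3, None, 1])):
        print(step, sorted(conjecture))
```

On any presentation of a finite set, the conjectures of this learner stop changing once every element has appeared. A k-bounded example-memory learner would additionally carry a store of at most k past items between steps.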
Synthesizing noise-tolerant language learners
 Theoretical Computer Science A
, 1997
Cited by 7 (3 self)
An index for an r.e. class of languages (by definition) generates a sequence of grammars defining the class. An index for an indexed family of languages (by definition) generates a sequence of decision procedures defining the family. F. Stephan’s model of noisy data is employed, in which, roughly, correct data crops up infinitely often, and incorrect data only finitely often. Studied, then, is the synthesis from indices for r.e. classes and for indexed families of languages of various kinds of noise-tolerant language learners for the corresponding classes or families indexed. Many positive results, as well as some negative results, are presented regarding the existence of such synthesizers. The proofs of most of the positive results yield, as pleasant corollaries, strict subset-principle or tell-tale style characterizations for the noise-tolerant learnability of the corresponding classes or families indexed.
Learning from Multiple Sources of Inaccurate Data
 In Proceedings of the International Workshop on Analogical and Inductive Inference in Dagstuhl
, 1992
Cited by 5 (2 self)
Most theoretical models of inductive inference make the idealized assumption that the data available to a learner is from a single and accurate source. The subject of inaccuracies in data emanating from a single source has been addressed by several authors. The present paper argues in favor of a more realistic learning model in which data emanates from multiple sources, some or all of which may be inaccurate. Three kinds of inaccuracies are considered: spurious data (modeled as noisy texts), missing data (modeled as incomplete texts), and a mixture of spurious and missing data (modeled as imperfect texts). Motivated by the above argument, the present paper introduces and theoretically analyzes a number of inference criteria in which a learning machine is fed data from multiple sources, some of which may be infected with inaccuracies. The learning situation modeled is the identification in the limit of programs from graphs of computable functions. The main parameters of the investigation are: kind of inaccuracy, total number of data sources, number of faulty data sources which produce data within an acceptable bound, and the bound on the number of errors allowed in the final hypothesis learned by the machine. Sufficient conditions are determined under which, for the same kind of inaccuracy, for the same
Spatial/Kinematic Domain and Lattice Computers
 JOURNAL OF EXPERIMENTAL AND THEORETICAL ARTIFICIAL INTELLIGENCE
, 1994
Cited by 4 (4 self)
An approach to analogical representation for objects and their motions in space is proposed. This approach involves lattice computer architectures and associated algorithms and is shown to be abstracted from the behavior of human beings mentally solving spatial/kinematic puzzles. There is also discussion of where in this approach the modeling of human cognition leaves off and the engineering begins. The possible relevance of the approach to a number of issues in Artificial Intelligence is discussed. These issues include efficiency of sentential versus analogical representations, common sense reasoning, update propagation, learning performance tasks, diagrammatic representations, spatial reasoning, metaphor, human categorization, and pattern recognition. Lastly there is a discussion of the somewhat related approach involving cellular automata applied to computational physics.
Strongly non-U-shaped learning results by general techniques
 In Proc. of COLT’2010
, 2010
Cited by 2 (0 self)
In learning, a semantic or behavioral U-shape occurs when a learner first learns, then unlearns, and, finally, relearns, some target concept (on the way to success). Within the framework of Inductive Inference, previous results have shown, for example, that such U-shapes are unnecessary for explanatory learning, but are necessary for behaviorally correct and non-trivial vacillatory learning. Herein we focus more on syntactic U-shapes. This paper introduces two general techniques and applies them especially to syntactic U-shapes in learning: one technique to show when they are necessary and one to show when they are unnecessary. The technique for the former is very general and applicable to a much wider range of learning criteria. It employs so-called self-learning classes of languages which are shown to characterize completely one criterion learning more than another. We apply these techniques to show that, for set-driven and partially set-driven learning, any kind of U-shapes are unnecessary. Furthermore, we show that U-shapes are not unnecessary in a strong way for iterative learning, contrasting an earlier result by Case and Moelius that semantic U-shapes are unnecessary for iterative learning.
Incremental learning with temporary memory
 THEORETICAL COMPUTER SCIENCE
, 2010
Cited by 2 (1 self)
In the inductive inference framework of learning in the limit, a variation of the bounded example memory (Bem) language learning model is considered. Intuitively, the new model constrains the learner’s memory not only in how much data may be retained, but also in how long that data may be retained. More specifically, the model requires that, if a learner commits an example x to memory in some stage of the learning process, then there is some subsequent stage for which x no longer appears in the learner’s memory. This model is called temporary example memory (Tem) learning. In some sense, it captures the idea that memories fade. Many interesting results concerning the Tem-learning model are presented. For example, there exists a class of languages that can be identified by memorizing k + 1 examples in the Tem sense, but that cannot be identified by memorizing k examples in the Bem sense. On the other hand, there exists a class of languages that can be identified by memorizing just 1 example in the Bem sense, but that cannot be identified by memorizing any number of examples in the Tem sense. (The proof of this latter result involves an infinitary self-reference argument.) Results are also presented concerning the special cases of: learning indexable classes of languages, and learning (arbitrary) classes of infinite languages.
Parsimony Hierarchies for Inductive Inference
 JOURNAL OF SYMBOLIC LOGIC
, 2004
Cited by 2 (1 self)
Freivalds defined an acceptable programming system independent criterion for learning programs for functions in which the final programs were required to be both correct and "nearly" minimal size, i.e., within a computable function of being purely minimal size. Kinber showed that this parsimony requirement on final programs limits learning power. However, in scientific inference, parsimony is considered highly desirable. A lim-computable function is (by definition) one calculable by a total procedure allowed to change its mind finitely many times about its output. Investigated is the possibility of assuaging somewhat the limitation on learning power resulting from requiring parsimonious final programs by use of criteria which require the final, correct programs to be "not-so-nearly" minimal size, e.g., to be within a lim-computable function of actual minimal size. It is shown that some parsimony in the final program is thereby retained, yet learning power strictly increases. Considered, then, are lim-computable functions as above but for which notations for constructive ordinals are used to bound the number of mind changes allowed regarding the output. This is a variant of an idea introduced by Freivalds and Smith. For this ordinal notation complexity bounded version of lim-computability, the power of the resultant learning criteria form finely graded, infinitely ramifying, infinite hierarchies intermediate between the computable and the lim-computable cases. Some of these hierarchies, for the natural notations determining them, are shown to be optimally tight.
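The notion of a total procedure that may change its mind finitely often about its output can be made concrete with a small Python sketch. The example is an assumption for illustration and does not come from the paper: the stage-t guess asks whether the Collatz trajectory of x reaches 1 within t steps, and the function names `stage_guess` and `mind_changes` are hypothetical. Each guess is computed by a procedure that always halts, and the guess for a fixed x changes at most once as t grows, so the limit over t exists.

```python
from typing import List


def stage_guess(x: int, t: int) -> int:
    """Total stage-t guess: 1 if the Collatz trajectory of x >= 1 hits 1 within t steps."""
    n = x
    for _ in range(t):
        if n == 1:
            return 1
        n = n // 2 if n % 2 == 0 else 3 * n + 1
    return 1 if n == 1 else 0


def mind_changes(x: int, stages: int) -> int:
    """Count how often the guess changes over stages 0 .. stages-1."""
    guesses: List[int] = [stage_guess(x, t) for t in range(stages)]
    return sum(1 for a, b in zip(guesses, guesses[1:]) if a != b)


if __name__ == "__main__":
    # The trajectory 6, 3, 10, 5, 16, 8, 4, 2, 1 takes 8 steps.
    print(stage_guess(6, 7), stage_guess(6, 8), mind_changes(6, 20))
```

Here the number of mind changes is bounded by the constant 1. The ordinal-bounded variants mentioned in the abstract allow richer, notation-controlled bounds on mind changes, which this sketch does not attempt to model.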
Systems cannot express their own truth.
Outline:
• Brief history of linguistic self-reference in mathematical logic.
• Meaning, achievement & applications of machine self-reference.
• Self-modeling/self-reflection: segue from machine case to the human reflective component of consciousness (other aspects of the complex phenomenon of consciousness, e.g., awareness and qualia, are not treated).
• What use is self-modeling/reference? Lessons from machine cases.
Summary
Abstract
Alice and Bob want to know if two strings of length n are almost equal. That is, do they differ on at most a bits? Let 0 ≤ a ≤ n − 1. We show that any deterministic protocol, as well as any error-free quantum protocol (C* version), for this problem requires at least n − 2 bits of communication. We show the same bounds for the problem of determining if two strings differ in exactly a bits. We also prove a lower bound of n/2 − 1 for error-free Q* quantum protocols. Our results are obtained by lower-bounding the ranks of the appropriate matrices.
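For comparison with the n − 2 lower bound, the obvious upper bound can be sketched in Python. This is not a protocol from the paper: it is the trivial deterministic protocol, in which Alice sends her entire string and Bob replies with the one-bit answer, and the function names are hypothetical. It uses n + 1 bits, so the stated lower bound leaves only a small additive gap.

```python
from typing import Tuple


def hamming_distance(x: str, y: str) -> int:
    """Number of positions at which two equal-length bit strings differ."""
    if len(x) != len(y):
        raise ValueError("strings must have equal length")
    return sum(1 for bx, by in zip(x, y) if bx != by)


def trivial_protocol(alice: str, bob: str, a: int) -> Tuple[bool, int]:
    """Return (answer, bits communicated) for 'differ on at most a bits?'."""
    message = alice                       # Alice -> Bob: all n of her bits
    answer = hamming_distance(message, bob) <= a
    bits_sent = len(message) + 1          # plus Bob -> Alice: the 1-bit answer
    return answer, bits_sent


if __name__ == "__main__":
    print(trivial_protocol("10110", "10011", a=2))
    print(trivial_protocol("10110", "10011", a=1))
```

The two strings in the usage lines differ in exactly two positions, so the answer flips between a = 2 and a = 1 while the communication cost stays at n + 1 = 6 bits.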