Instance-based learning algorithms
- Machine Learning
, 1991
Cited by 897 (18 self)
Abstract. Storing and using specific instances improves the performance of several supervised learning algorithms. These include algorithms that learn decision trees, classification rules, and distributed networks. However, no investigation has analyzed algorithms that use only specific instances to solve incremental learning tasks. In this paper, we describe a framework and methodology, called instance-based learning, that generates classification predictions using only specific instances. Instance-based learning algorithms do not maintain a set of abstractions derived from specific instances. This approach extends the nearest neighbor algorithm, which has large storage requirements. We describe how storage requirements can be significantly reduced with, at most, minor sacrifices in learning rate and classification accuracy. While the storage-reducing algorithm performs well on several real-world databases, its performance degrades rapidly with the level of attribute noise in training instances. Therefore, we extended it with a significance test to distinguish noisy instances. This extended algorithm's performance degrades gracefully with increasing noise levels and compares favorably with a noise-tolerant decision tree algorithm.
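The abstract does not spell out how storage is reduced. As a rough illustration only, the sketch below assumes one simple policy: process instances one at a time and keep an instance only when the instances stored so far would misclassify it. The function names, the Euclidean distance, and the toy data are all assumptions made for this example, not details taken from the paper.

```python
import math

def nearest_label(store, x):
    """Return the label of the stored instance closest to x (Euclidean distance)."""
    best = min(store, key=lambda item: math.dist(item[0], x))
    return best[1]

def train_storage_reducing(stream):
    """Process (features, label) pairs incrementally.

    An instance is stored only if the current store is empty or would
    misclassify it; correctly classified instances are discarded.
    """
    store = []
    for x, y in stream:
        if not store or nearest_label(store, x) != y:
            store.append((x, y))
    return store

# Hypothetical toy data: two well-separated classes.
data = [((0.0, 0.0), "a"), ((0.1, 0.0), "a"), ((1.0, 1.0), "b"),
        ((0.9, 1.0), "b"), ((0.0, 0.2), "a")]
store = train_storage_reducing(data)
print(len(store), "of", len(data), "instances stored")
```

On this toy stream only the first instance of each class is kept. A policy like this also tends to keep mislabeled or noisy instances, since those are exactly the ones that get misclassified, which is consistent with the noise sensitivity the abstract reports.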
Attention, similarity, and the identification-categorization relationship
, 1986
Cited by 299 (25 self)
A unified quantitative approach to modeling subjects' identification and categorization of multidimensional perceptual stimuli is proposed and tested. Two subjects identified and categorized the same set of perceptually confusable stimuli varying on separable dimensions. The identification data were modeled using Shepard's (1957) multidimensional scaling-choice framework. This framework was then extended to model the subjects' categorization performance. The categorization model, which generalizes the context theory of classification developed by Medin and Schaffer (1978), assumes that subjects store category exemplars in memory. Classification decisions are based on the similarity of stimuli to the stored exemplars. It is assumed that the same multidimensional perceptual representation underlies performance in both the identification and categorization paradigms. However, because of the influence of selective attention, similarity relationships change systematically across the two paradigms. Some support was gained for the hypothesis that subjects distribute attention among component dimensions so as to optimize categorization performance. Evidence was also obtained that subjects may have augmented their category representations with inferred exemplars. Implications of the results for theories of multidimensional scaling and categorization are discussed.
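A minimal sketch of exemplar-based classification with attention weights may make the idea concrete. The abstract gives no formulas, so the specifics here are assumptions: a weighted city-block distance, similarity that decays exponentially with distance (sensitivity parameter `c`), and a choice rule that divides summed similarity to one category's exemplars by the total, with no response-bias terms.

```python
import math

def similarity(x, y, weights, c=1.0):
    """Similarity decays exponentially with attention-weighted city-block distance."""
    d = sum(w * abs(a - b) for w, a, b in zip(weights, x, y))
    return math.exp(-c * d)

def category_probs(probe, exemplars, weights, c=1.0):
    """Probability of each category: summed similarity to its exemplars, normalized."""
    totals = {}
    for features, label in exemplars:
        totals[label] = totals.get(label, 0.0) + similarity(probe, features, weights, c)
    z = sum(totals.values())
    return {label: s / z for label, s in totals.items()}

# Hypothetical stimuli: the categories differ only on the first dimension.
exemplars = [((1.0, 1.0), "A"), ((1.0, 3.0), "A"),
             ((3.0, 1.0), "B"), ((3.0, 3.0), "B")]
probe = (1.5, 2.0)

even = category_probs(probe, exemplars, weights=(0.5, 0.5))
focused = category_probs(probe, exemplars, weights=(1.0, 0.0))
print(even["A"], focused["A"])
```

Shifting all attention to the diagnostic first dimension raises the probability of the correct category for this probe, which illustrates how selective attention can change similarity relationships without changing the underlying stimulus coordinates.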
Transfer of Cognitive Skill
, 1989
Cited by 293 (10 self)
A framework for skill acquisition is proposed that includes two major stages in the development of a cognitive skill: a declarative stage in which facts about the skill domain are interpreted and a procedural stage in which the domain knowledge is directly embodied in procedures for performing the skill. This general framework has been instantiated in the ACT system in which facts are encoded in a propositional network and procedures are encoded as productions. Knowledge compilation is the process by which the skill transits from the declarative stage to the procedural stage. It consists of the subprocesses of composition, which collapses sequences of productions into single productions, and proceduralization, which embeds factual knowledge into productions. Once proceduralized, further learning processes operate on the skill to make the productions more selective in their range of applications. These processes include generalization, discrimination, and strengthening of productions. Comparisons are made to similar concepts from past learning theories. How these learning mechanisms apply to produce the power law speedup in processing time with practice is discussed.
It requires at least 100 hours of learning and practice to acquire any significant cognitive skill to a reasonable degree of proficiency. For instance, after 100 hours a student learning to program a computer has achieved only a very modest facility in the skill. Learning one's primary language takes tens of thousands of hours. The psychology of human learning has been very thin in ideas about what happens to skills under the impact of this amount of learning, and for obvious reasons. This article presents a theory about the changes in the nature of a skill over such large time scales and about the basic learning processes that are responsible.
Learning and development in neural networks: The importance of starting small
- Cognition
, 1993
Cited by 290 (12 self)
It is a striking fact that in humans the greatest learning occurs precisely at that point in time (childhood) when the most dramatic maturational changes also occur. This report describes possible synergistic interactions between maturational change and the ability to learn a complex domain (language), as investigated in connectionist networks. The networks are trained to process complex sentences involving relative clauses, number agreement, and several types of verb argument structure. Training fails in the case of networks which are fully formed and ‘adultlike’ in their capacity. Training succeeds only when networks begin with limited working memory and gradually ‘mature’ to the adult state. This result suggests that rather than being a limitation, developmental restrictions on resources may constitute a necessary prerequisite for mastering certain complex domains. Specifically, successful learning may depend on starting small.
An Introduction to Machine Translation
, 1992
Cited by 276 (7 self)
Abstract. In the last ten years there has been a significant amount of research in Machine Translation within a “new” paradigm of empirical approaches, often labelled collectively as “Example-based” approaches. The first manifestation of this approach caused some surprise and hostility among observers more used to different ways of working, but the techniques were quickly adopted and adapted by many researchers, often creating hybrid systems. This paper reviews the various research efforts within this paradigm reported to date, and attempts a categorisation of different manifestations of the general approach.
A Weighted Nearest Neighbor Algorithm for Learning with Symbolic Features
- Machine Learning
, 1993
Cited by 249 (3 self)
In the past, nearest neighbor algorithms for learning from examples have worked best in domains in which all features had numeric values. In such domains, the examples can be treated as points and distance metrics can use standard definitions. In symbolic domains, a more sophisticated treatment of the feature space is required. We introduce a nearest neighbor algorithm for learning in domains with symbolic features. Our algorithm calculates distance tables that allow it to produce real-valued distances between instances, and attaches weights to the instances to further modify the structure of feature space. We show that this technique produces excellent classification accuracy on three problems that have been studied by machine learning researchers: predicting protein secondary structure, identifying DNA promoter sequences, and pronouncing English text. Direct experimental comparisons with the other learning algorithms show that our nearest neighbor algorithm is comparable or superior ...
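To illustrate what a distance table for a symbolic feature might look like, the sketch below assumes a value-difference style measure: two values are close when they predict similar class distributions. The abstract does not state the formula, so the function name, the absolute-difference form, and the toy nucleotide data are assumptions for this example; the per-instance weights mentioned in the abstract are not modeled.

```python
from collections import Counter, defaultdict

def value_distance_table(values, labels):
    """Build delta[(v1, v2)] for one symbolic feature.

    delta sums, over classes, the absolute difference between the
    class-conditional frequencies P(class | v1) and P(class | v2).
    """
    counts = defaultdict(Counter)
    for v, y in zip(values, labels):
        counts[v][y] += 1
    classes = sorted(set(labels))
    table = {}
    for v1 in counts:
        n1 = sum(counts[v1].values())
        for v2 in counts:
            n2 = sum(counts[v2].values())
            table[(v1, v2)] = sum(
                abs(counts[v1][c] / n1 - counts[v2][c] / n2) for c in classes
            )
    return table

# Hypothetical training column: one nucleotide position and its class labels.
values = ["A", "A", "G", "G", "T", "T"]
labels = ["pos", "pos", "pos", "neg", "neg", "neg"]
table = value_distance_table(values, labels)
print(table[("A", "G")], table[("A", "T")])
```

With one such table per feature, the distance between two instances can be taken as the sum of the table entries for their feature values, which gives real-valued distances even though every feature is symbolic.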
Toward an instance theory of automatization
- Psychological Review
, 1988
Cited by 223 (1 self)
This article presents a theory in which automatization is construed as the acquisition of a domain-specific knowledge base, formed of separate representations, instances, of each exposure to the task. Processing is considered automatic if it relies on retrieval of stored instances, which will occur only after practice in a consistent environment. Practice is important because it increases the amount retrieved and the speed of retrieval; consistency is important because it ensures that the retrieved instances will be useful. The theory accounts quantitatively for the power-function speed-up and predicts a power-function reduction in the standard deviation that is constrained to have the same exponent as the power function for the speed-up. The theory accounts for qualitative properties as well, explaining how some may disappear and others appear with practice. More generally, it provides an alternative to the modal view of automaticity, arguing that novice performance is limited by a lack of knowledge rather than a scarcity of resources. The focus on learning avoids many problems with the modal view that stem from its focus on resource limitations.
Automaticity is an important phenomenon in everyday mental life. Most of us recognize that we perform routine activities quickly and effortlessly, with little thought and conscious awareness: in short, automatically (James, 1890). As a result, we often perform those activities on "automatic pilot" and turn our minds to other things. For example, we can drive to dinner while conversing in depth with a visiting scholar, or we can make coffee while planning dessert. However, these benefits may be offset by costs. The automatic pilot can lead us astray, causing errors and sometimes catastrophes (Reason & Mycielska, 1982). If the conversation is deep enough, we may find ourselves and the scholar arriving at the office rather than the restaurant, or we may discover that we aren't sure whether we put two or three scoops of coffee into the pot.
Automaticity is also an important phenomenon in skill acquisition (e.g., Bryan & Harter, 1899). Skills are thought to consist largely of collections of automatic processes and procedures
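The claim that the mean and the standard deviation shrink as power functions with the same exponent can be checked under one simplifying assumption, made here purely for illustration: each stored instance has an independent retrieval time drawn from a Weibull distribution, and the response time is the fastest of the n retrievals. The minimum of n such times is again Weibull with its scale multiplied by n^(-1/shape), so both statistics scale by that same factor. The sketch ignores any competing algorithmic process, and the function name and parameter values are hypothetical.

```python
import math

def min_weibull_stats(n, scale=1.0, shape=2.0):
    """Mean and SD of the fastest of n independent Weibull retrieval times.

    The minimum of n Weibull(scale, shape) variables is Weibull with
    scale * n**(-1/shape) and the same shape.
    """
    s = scale * n ** (-1.0 / shape)
    g1 = math.gamma(1 + 1 / shape)
    g2 = math.gamma(1 + 2 / shape)
    mean = s * g1
    sd = s * math.sqrt(g2 - g1 ** 2)
    return mean, sd

stats = {n: min_weibull_stats(n) for n in (1, 4, 16, 64)}
for n, (mean, sd) in stats.items():
    print(n, round(mean, 4), round(sd, 4))
```

With `shape=2.0`, each fourfold increase in the number of stored instances halves both the mean and the standard deviation, so the two curves are power functions of n with the shared exponent -1/2.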
MAC/FAC: A Model of Similarity-based Retrieval
- Cognitive Science
, 1991
Cited by 217 (49 self)
We present a model of similarity-based retrieval which attempts to capture three psychological phenomena: (1) people are extremely good at judging similarity and analogy when given items to compare. (2) Superficial remindings are much more frequent than structural remindings. (3) People sometimes experience and use purely structural analogical remindings. Our model, called MAC/FAC (for "many are called but few are chosen") consists of two stages. The first stage (MAC) uses a computationally cheap, non-structural matcher to filter candidates from a pool of memory items. That is, we redundantly encode structured representations as content vectors, whose dot product yields an estimate of how well the corresponding structural representations will match. The second stage (FAC) uses SME to compute a true structural match between the probe and output from the first stage. MAC/FAC has been fully implemented, and we show that it is capable of modeling patterns of access found in psychological ...
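The first stage described above can be sketched as a cheap filter. In this hedged example, a content vector is assumed to be a count of the predicate names that occur in a description, and the filter keeps a fixed number of top-scoring memory items; both choices, along with every name and the toy memory, are assumptions for illustration. The second stage is not shown, since it relies on SME, which is outside the scope of a short sketch.

```python
from collections import Counter

def content_vector(predicates):
    """Flatten a structured description into counts of its predicate names."""
    return Counter(predicates)

def dot(u, v):
    """Dot product of two sparse count vectors."""
    return sum(count * v[name] for name, count in u.items())

def mac_stage(probe, memory, keep=2):
    """Score every memory item against the probe and keep the best few."""
    pv = content_vector(probe)
    scored = [(dot(pv, content_vector(preds)), name)
              for name, preds in memory.items()]
    scored.sort(reverse=True)
    return [name for score, name in scored[:keep] if score > 0]

# Hypothetical memory pool: each item lists the predicates it mentions.
memory = {
    "solar_system": ["attracts", "revolves", "mass", "mass", "cause"],
    "atom": ["attracts", "revolves", "charge", "charge", "cause"],
    "recipe": ["mix", "heat", "serve"],
}
probe = ["attracts", "revolves", "cause", "mass"]
candidates = mac_stage(probe, memory)
print(candidates)
```

Because the filter looks only at which predicates occur and not at how they are connected, it is fast but can favor superficially similar items, which fits the observation that superficial remindings are more frequent than structural ones.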
The adaptive nature of human categorization
- Psychological Review
, 1991
Cited by 159 (2 self)
A rational model of human categorization behavior is presented that assumes that categorization reflects the derivation of optimal estimates of the probability of unseen features of objects. A Bayesian analysis is performed of what optimal estimations would be if categories formed a disjoint partitioning of the object space and if features were independently displayed within a category. This Bayesian analysis is placed within an incremental categorization algorithm. The resulting rational model accounts for effects of central tendency of categories, effects of specific instances, learning of linearly nonseparable categories, effects of category labels, extraction of basic level categories, base-rate effects, probability matching in categorization, and trial-by-trial learning functions. Although the rational model considers just 1 level of categorization, it is shown how predictions can be enhanced by considering higher and lower levels. Considering prediction at the lower, individual level allows integration of this rational analysis of categorization with the earlier rational analysis of memory (Anderson & Milson, 1989).
Anderson (1990) presented a rational analysis of human cognition. The term rational derives from similar "rational-man" analyses in economics. Rational analyses in other fields are sometimes called adaptationist analyses. Basically, they are efforts to explain the behavior in some domain on the assumption that the behavior is optimized with respect to some criteria of adaptive importance. This article begins with a general characterization of how one develops a rational theory of a particular cognitive phenomenon. Then I present the basic theory of categorization developed in Anderson (1990) and review the applications from that book. Since the writing of the book, the theory has been greatly extended and applied to many new phenomena. Most of this article describes these new developments and applications.
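The incremental algorithm mentioned in the abstract can be illustrated with a small sketch. The abstract gives no equations, so everything specific here is an assumption: a coupling parameter that sets the prior for joining an existing category versus starting a new one, smoothed per-feature frequencies treated as independent within a category, and a rule that assigns each object to the single highest-scoring category.

```python
def assign_incrementally(items, n_values=2, coupling=0.5, alpha=1.0):
    """Assign each item, in order, to the best existing category or a new one."""
    clusters = []  # each cluster is a list of member items (tuples of feature values)
    labels = []
    for n, item in enumerate(items):
        denom = (1 - coupling) + coupling * n
        scores = []
        for members in clusters:
            # Prior grows with category size; likelihood multiplies
            # smoothed per-feature match frequencies (independence assumption).
            prior = coupling * len(members) / denom
            like = 1.0
            for i, v in enumerate(item):
                match = sum(1 for m in members if m[i] == v)
                like *= (match + alpha) / (len(members) + alpha * n_values)
            scores.append(prior * like)
        # Score for opening a new category: uniform likelihood over feature values.
        scores.append((1 - coupling) / denom * (1.0 / n_values) ** len(item))
        best = max(range(len(scores)), key=scores.__getitem__)
        if best == len(clusters):
            clusters.append([item])
        else:
            clusters[best].append(item)
        labels.append(best)
    return labels

# Hypothetical binary-feature objects forming two loose groups.
items = [(1, 1, 1, 1), (1, 1, 1, 0), (0, 0, 0, 0), (0, 0, 0, 1)]
labels = assign_incrementally(items)
print(labels)
```

Each object is committed to a category as soon as it is seen, so the partition depends on presentation order; that is the price of an incremental scheme compared with searching over all possible partitions.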
A Rational Analysis
Several theorists have promoted the idea that psychologists might understand human behavior by assuming it is adapted to the environment (e.g., Brunswik, 1956; Campbell, 1974; Gib-

