## Toward a unified theory of similarity and recognition (1988)

Venue: Psychological Review

Citations: 81 (6 self)

### BibTeX

@ARTICLE{Ashby88towarda,

author = {F. Gregory Ashby and Nancy A. Perrin},

title = {Toward a unified theory of similarity and recognition},

journal = {Psychological Review},

year = {1988},

volume = {95},

pages = {124--150}

}

### Abstract

A new theory of similarity, rooted in the detection and recognition literatures, is developed. The general recognition theory assumes that the perceptual effect of a stimulus is random but that on any single trial it can be represented as a point in a multidimensional space. Similarity is a function of the overlap of perceptual distributions. It is shown that the general recognition theory contains Euclidean distance models of similarity as a special case but that, unlike them, it is not constrained by any distance axioms. Three experiments are reported that test the empirical validity of the theory. In these experiments the general recognition theory accounts for similarity data as well as the currently popular similarity theories do, and it accounts for identification data as well as the long-standing "champion" identification model does.

The concept of similarity is of fundamental importance in psychology. Not only is there a vast literature concerned directly with the interpretation of subjective similarity judgments (e.g., as in multidimensional scaling), but the concept also plays a crucial but less direct role in the modeling of many psychophysical tasks. This is particularly true in the case of pattern and form recognition. It is frequently assumed that the greater the similarity between a pair of stimuli, the more likely one will be confused with the other in a recognition task (e.g., Luce, 1963; Shepard, 1964; Tversky & Gati, 1982). Yet despite the potentially close relationship between the two, there have been only a few attempts at developing theories that unify the similarity and recognition literatures. Most attempts to link the two have used a distance-based similarity measure to predict the confusions in recognition experiments ...
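The abstract's statement that similarity is a function of the overlap of perceptual distributions can be illustrated with a small numerical sketch. This is not the paper's similarity measure. It assumes two bivariate normal perceptual distributions that share one covariance matrix, and an unbiased decision boundary, in which case the share of distribution A falling on B's side of the boundary equals Φ(-d/2), where d is the Mahalanobis distance between the means. The function names (`normal_cdf`, `mahalanobis_2d`, `overlap_proportion`) are hypothetical and chosen only for this example.

```python
import math


def normal_cdf(z):
    """Standard normal cumulative distribution function."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


def mahalanobis_2d(mean_a, mean_b, cov):
    """Mahalanobis distance between two 2-D means under a shared covariance."""
    (var_x, cov_xy), (cov_yx, var_y) = cov
    det = var_x * var_y - cov_xy * cov_yx
    if var_x <= 0 or det <= 0:
        raise ValueError("covariance matrix must be positive definite")
    dx = mean_a[0] - mean_b[0]
    dy = mean_a[1] - mean_b[1]
    # Quadratic form [dx dy] * inverse(cov) * [dx dy]^T, written out for 2x2.
    quad = (var_y * dx * dx - (cov_xy + cov_yx) * dx * dy + var_x * dy * dy) / det
    return math.sqrt(quad)


def overlap_proportion(mean_a, mean_b, cov):
    """Share of distribution A on B's side of the unbiased boundary.

    Assumption for this sketch only: both distributions are bivariate
    normal with the same covariance, so the share is Phi(-d / 2).
    """
    return normal_cdf(-mahalanobis_2d(mean_a, mean_b, cov) / 2.0)


if __name__ == "__main__":
    correlated = [[1.0, 0.5], [0.5, 1.0]]
    # Both pairs of means are the same Euclidean distance apart,
    # but the overlap differs because of the perceptual correlation.
    print(overlap_proportion((0.0, 0.0), (1.0, 1.0), correlated))
    print(overlap_proportion((0.0, 0.0), (1.0, -1.0), correlated))
```

In the usage lines, the two stimulus pairs are equally far apart in Euclidean terms, yet their overlap differs once the perceptual dimensions are correlated. That is one informal way to see why an overlap-based measure need not behave like a Euclidean distance.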

### Citations

2779 | Introduction to statistical pattern recognition, 2nd ed - Fukunaga - 1990 |

1087 | Principles of psychology - James - 1890 |

1008 |
Features of Similarity
- Tversky
- 1977
Citation Context ..., 1986; Shepard, 1957, 1958b). It is now widely suspected, however, that standard distance-based similarity measures do not provide an adequate account of perceived similarity (e.g., Krumhansl, 1978; Tversky, 1977). Our approach takes the opposite tack. We begin with a very powerful and general theory of recognition and use it to derive a new similarity measure, which successfully accounts for a wide variety o... |

493 |
Mental rotation of three-dimensional objects
- Shepard, Metzler
- 1971
Citation Context ...many common features should be judged more similar than the stimuli differing by a rotation in an experiment requiring speeded responses, because mental rotation requires extra processing time (e.g., Shepard & Metzler, 1971). Although both the general recognition theory and MDS theories postulate similar multidimensional spaces, these two classes of models have very different foundations. In MDS models, similarity is a ... |

477 |
Context theory of classification learning
- Medin, Schaffer
- 1978
Citation Context ...et al., 1981; but see also Ennis & Mullen, 1986; Luce & Galanter, 1963; MacKay & Zinnes, 1981; Suppes & Zinnes, 1963; Zinnes & MacKay, 1983). The second version, proposed by Nosofsky (1986; see also Medin & Schaffer, 1978), assumes that perceived similarity is based on the average similarity between all samples in one stimulus class and all samples in the other (where similarity = exp[-distance]). Analytic predictions... |

452 | Attention, similarity, and the identification-categorization relationship
- Nosofsky
- 1986
Citation Context ...llow subjects to weight differentially the psychological dimensions, as in the weighted Euclidean model, in a manner that depends on stimulus context (e.g., Carroll & Chang, 1970; Getty et al., 1980; Nosofsky, 1986). In this model, adding a new stimulus to the ensemble causes the subject to readjust the amount of attention focused on each dimension, thereby effectively stretching some dimensions and shrinking o... |

359 |
Multidimensional scaling by optimizing goodness of fit to a nonmetric hypothesis
- Kruskal
- 1964
Citation Context ...ultidimensional metric space and assumes that judgments of the perceived similarity of two stimuli are inversely related to the distance between their perceptual representations (e.g., Davison, 1983; Kruskal, 1964a, 1964b; Shepard, 1962a, 1962b; Torgerson, 1958). This class of models, known as the geometric models of similarity, is contained within the larger class of multidimensional scaling (MDS) models. MDS... |

339 |
Analysis of individual differences in multidimensional scaling via an n-way generalization of Eckart-Young decomposition
- Carroll, Chang
- 1970
Citation Context ...ake MDS models sensitive to context is to allow subjects to weight differentially the psychological dimensions, as in the weighted Euclidean model, in a manner that depends on stimulus context (e.g., Carroll & Chang, 1970; Getty et al., 1980; Nosofsky, 1986). In this model, adding a new stimulus to the ensemble causes the subject to readjust the amount of attention focused on each dimension, thereby effectively stretc... |

278 |
Theory and Methods of Scaling
- Torgerson
- 1958
Citation Context ...judgments of the perceived similarity of two stimuli are inversely related to the distance between their perceptual representations (e.g., Davison, 1983; Kruskal, 1964a, 1964b; Shepard, 1962a, 1962b; Torgerson, 1958). This class of models, known as the geometric models of similarity, is contained within the larger class of multidimensional scaling (MDS) models. MDS models assume the same sort of stimulus represe... |

277 |
On the genesis of abstract ideas
- Posner, Keele
- 1970
Citation Context ...based on the distance between category representations. This model is a straightforward generalization of prototype models of categorization (e.g., Ashby & Gott, 1988; Homa, Sterling, & Trepel, 1981; Posner & Keele, 1968, 1970; Rosch, 1973; Rosch, Simpson, & Miller, 1976), and so we will refer to it as the prototype-based MDS model. Because the stimulus means are identical in all three Experiment 1 conditions, their p... |

253 |
The processing of information and structure
- Garner
- 1974
Citation Context ...eneral Gaussian recognition. MDS = multidimensional scaling. Values with a subscript a are significant at α = .05. Values with a subscript b are significant at α = .01. metrics (e.g., Attneave, 1950; Garner, 1974; Shepard, 1964; Torgerson, 1958; Tversky & Gati, 1982; for an exception, see Nosofsky, 1987). Ashby and Townsend (1986) presented independent evidence that the stimulus components used by Townsend et... |

225 |
Nonmetric multidimensional scaling: a numerical method
- Kruskal
- 1964
Citation Context ...ultidimensional metric space and assumes that judgments of the perceived similarity of two stimuli are inversely related to the distance between their perceptual representations (e.g., Davison, 1983; Kruskal, 1964a, 1964b; Shepard, 1962a, 1962b; Torgerson, 1958). This class of models, known as the geometric models of similarity, is contained within the larger class of multidimensional scaling (MDS) models. MDS... |

176 | Decision rules in the perception and categorization of multidimensional stimuli
- Ashby, Gott
- 1988
Citation Context ...A(x, y) " A response bias occurs if the boundary is set anywhere else. In this case the subject's decision rule is no longer optimal in terms of accuracy, although it may still maximize payoffs. (See Ashby & Gott, 1988, for a more thorough discussion of perceptual decision rules.) The most familiar version of this model assumes the perceptual distributions are multivariate normal. This special case, called the gene... |

159 |
A law of comparative judgment
- Thurstone
- 1927
Citation Context ...umes the perceptual distributions are multivariate normal. This special case, called the general Gaussian recognition model, is related to the Case I model of Thurstone's law of categorical judgment (Thurstone, 1927; see also Hefner, 1958; Torgerson, 1958; Zinnes & MacKay, 1983), but it can also be viewed as a multidimensional generalization of signal-detection theory (see, e.g., Green & Swets, 1966; Tanner, 195... |

153 |
Choice, similarity, and the context theory of classification
- Nosofsky
- 1984
Citation Context ...ed a distance-based similarity measure to predict the confusions in recognition experiments (Appelman & Mayzner, 1982; Getty, Swets, & Swets, 1980; Getty, Swets, Swets, & Green, 1979; Nakatani, 1972; Nosofsky, 1984, 1985b, 1986; Shepard, 1957, 1958b). It is now widely suspected, however, that standard distance-based similarity measures do not provide an adequate account of perceived similarity (e.g., Krumhansl,... |

130 |
Detection and recognition
- Luce
- 1963
Citation Context ...e of pattern and form recognition. It is frequently assumed that the greater the similarity between a pair of stimuli, the more likely one will be confused with the other in a recognition task (e.g., Luce, 1963; Shepard, 1964; Tversky & Gati, 1982). Yet despite the potentially close relationship between the two, there have been only a few attempts at developing theories that unify the similarity and recogni... |

121 |
Varieties of perceptual independence
- Ashby, Townsend
- 1986
Citation Context ...72) and Carroll and Chang (1972; see also Carroll & Wish, 1974). Their idea (see also Tanner, 1956) was that the degree of perceptual dependence should be related to the angle between dimensions (see Ashby & Townsend, 1986), and so the resulting model, known as the general Euclidean scaling model, allows oblique dimensions and defines the perceived dissimilarity of $S_A$ to $S_B$ for Subject $j$ as $d_j(S_A, S_B) = [w_{xj}^2 (x_A - x_B)^2$... |

102 |
Stimulus and response generalization: A stochastic model relating generalization to distance in psychological space
- Shepard
- 1957
Citation Context ...y measure to predict the confusions in recognition experiments (Appelman & Mayzner, 1982; Getty, Swets, & Swets, 1980; Getty, Swets, Swets, & Green, 1979; Nakatani, 1972; Nosofsky, 1984, 1985b, 1986; Shepard, 1957, 1958b). It is now widely suspected, however, that standard distance-based similarity measures do not provide an adequate account of perceived similarity (e.g., Krumhansl, 1978; Tversky, 1977). Our a... |

93 |
Concerning the applicability of geometric models to similarity data: The interrelationship between similarity and spatial density
- Krumhansl
- 1978
Citation Context ...fsky, 1984, 1985b, 1986; Shepard, 1957, 1958b). It is now widely suspected, however, that standard distance-based similarity measures do not provide an adequate account of perceived similarity (e.g., Krumhansl, 1978; Tversky, 1977). Our approach takes the opposite tack. We begin with a very powerful and general theory of recognition and use it to derive a new similarity measure, which successfully accounts for a... |

89 | Attention and learning processes in the identification and categorization of integral stimuli
- Nosofsky
- 1987
Citation Context ... significant at α = .05. Values with a subscript b are significant at α = .01. metrics (e.g., Attneave, 1950; Garner, 1974; Shepard, 1964; Torgerson, 1958; Tversky & Gati, 1982; for an exception, see Nosofsky, 1987). Ashby and Townsend (1986) presented independent evidence that the stimulus components used by Townsend et al. were separable. The Euclidean MDS-choice-model fits agree with results of Nosofsky (198... |

83 | Uncertainty and structure as psychological concepts - Garner - 1962 |

67 |
Multidimensional scaling
- Davison
- 1983
Citation Context ...s points in a multidimensional metric space and assumes that judgments of the perceived similarity of two stimuli are inversely related to the distance between their perceptual representations (e.g., Davison, 1983; Kruskal, 1964a, 1964b; Shepard, 1962a, 1962b; Torgerson, 1958). This class of models, known as the geometric models of similarity, is contained within the larger class of multidimensional scaling (M... |

63 |
Similarity, Separability and the Triangle Inequality
- Tversky, Gati
- 1982
Citation Context ...nition. It is frequently assumed that the greater the similarity between a pair of stimuli, the more likely one will be confused with the other in a recognition task (e.g., Luce, 1963; Shepard, 1964; Tversky & Gati, 1982). Yet despite the potentially close relationship between the two, there have been only a few attempts at developing theories that unify the similarity and recognition literatures. Most attempts to li... |

56 |
Dimensions of similarity
- Attneave
Citation Context ...GGR stands for general Gaussian recognition. MDS = multidimensional scaling. Values with a subscript a are significant at α = .05. Values with a subscript b are significant at α = .01. metrics (e.g., Attneave, 1950; Garner, 1974; Shepard, 1964; Torgerson, 1958; Tversky & Gati, 1982; for an exception, see Nosofsky, 1987). Ashby and Townsend (1986) presented independent evidence that the stimulus components used ... |

47 | Visual sensitivities to color differences in daylight - MacAdam - 1942 |

46 |
Basic measurement theory
- Suppes, Zinnes
- 1963
Citation Context ...le and weighted Euclidean models), but violations of the triangle inequality do not. MacKay and Zinnes (1981; see also Ennis & Mullen, 1986; Hefner, 1958; Luce & Galanter, 1963; Mullen & Ennis, 1987; Suppes & Zinnes, 1963; Zinnes & MacKay, 1983) introduced a probabilistic version of MDS. They started with the traditional Euclidean model but assumed that the stimulus coordinates are normally and independently distribut... |

42 |
Structural bases of typicality effects
- Rosch, Simpson, et al.
- 1976
Citation Context ...presentations. This model is a straightforward generalization of prototype models of categorization (e.g., Ashby & Gott, 1988; Homa, Sterling, & Trepel, 1981; Posner & Keele, 1968, 1970; Rosch, 1973; Rosch, Simpson, & Miller, 1976), and so we will refer to it as the prototype-based MDS model. Because the stimulus means are identical in all three Experiment 1 conditions, their perceptual representations should also be identical,... |

41 |
Limitations of exemplar-based generalization and the abstraction of categorical information
- Homa, Sterling, et al.
- 1981
Citation Context ...ents of category similarity are based on the distance between category representations. This model is a straightforward generalization of prototype models of categorization (e.g., Ashby & Gott, 1988; Homa, Sterling, & Trepel, 1981; Posner & Keele, 1968, 1970; Rosch, 1973; Rosch, Simpson, & Miller, 1976), and so we will refer to it as the prototype-based MDS model. Because the stimulus means are identical in all three Experiment... |

32 |
Weighting common and distinctive features in perceptual and conceptual judgements. Cognitive
- Gati, Tversky
- 1984
Citation Context ...generalize Equation 13 somewhat. For example, celery, apples, and automobiles are never confused, and yet we judge celery and apples to be more similar than celery and automobiles (see Appendix A and Gati & Tversky, 1984, for other failures of the assumption). In Appendix A we present a generalization of Equation 13 that can account for this and many other examples in which similarity is not a strictly increasing fun... |

32 | Overall similarity and the identification of separable-dimension stimuli: A choice model analysis
- Nosofsky
- 1985
Citation Context ...ing $\eta_{ij}$ in Equation 11 with $\exp(-d_{ij})$, where $d_{ij}$ is the distance between the perceptual representations of stimuli $S_i$ and $S_j$. The resulting model, which has come to be known as the MDS-choice model (e.g., Nosofsky, 1985a, 1985b, 1986), predicts that $P(R_j \mid S_i) = \beta_j \exp(-d_{ij}) / \sum_m \beta_m \exp(-d_{im})$ (12). Many different versions of this model can be formulated. Among the simplest and most obvious is one based on simple... |

31 | The choice axiom after twenty years
- Luce
- 1977
Citation Context ...model that has been most successful in predicting a wide variety of confusion matrices over the last 20 years is the biased-choice model (Luce, 1963; Shepard, 1957; but see also, e.g., Holbrook, 1975; Luce, 1977; Townsend, 1971; Townsend & Ashby, 1982). In the biased-choice model, $P(R_j \mid S_i)$ is a function of the similarity of stimulus $S_i$ to stimulus $S_j$, denoted $\eta_{ij}$, and of the bias toward response $R_j$, denoted $\beta_j$... |

28 |
Stimulus and response generalization: tests of a model relating generalization
- Shepard
- 1958
Citation Context ...can be formulated. Among the simplest and most obvious is one based on simple Euclidean distances. This version has been investigated by several authors (e.g., Shepard, 1958b; Takane & Shibayama, 1985). Nosofsky (1984, 1985b, 1986, 1987) generalized the model to account for categorization data, and he also considered other Minkowski distance metrics. In addition, Nosofsk... |

27 |
Stimulus and response generalization: Deduction of the generalization gradient from a trace model
- Shepard
- 1958
Citation Context ...can be formulated. Among the simplest and most obvious is one based on simple Euclidean distances. This version has been investigated by several authors (e.g., Shepard, 1958b; Takane & Shibayama, 1985). Nosofsky (1984, 1985b, 1986, 1987) generalized the model to account for categorization data, and he also considered other Minkowski distance metrics. In addition, Nosofsk... |

24 |
Models and methods for three-way multidimensional scaling
- Carroll, Wish
- 1974
Citation Context ...reported data in which this assumption appears to be violated. The weighted Euclidean model was generalized to allow for perceptual dependencies by Tucker (1972) and Carroll and Chang (1972; see also Carroll & Wish, 1974). Their idea (see also Tanner, 1956) was that the degree of perceptual dependence should be related to the angle between dimensions (see Ashby & Townsend, 1986), and so the resulting model, known as ... |

22 |
Psychological scaling without a unit of measurement
- Coombs
Citation Context ...tion theory can also account for preference judgments. Not only does it contain many powerful preference models as special cases (e.g., the unfolding theory of Coombs, 1950, and Bennett & Hays, 1960), but it can also account for empirical results that cause other theories much difficulty (e.g., multiple-peaked preference functions). We feel, however, that the most impor... |

22 | Attention bands in absolute identification
- Luce, Green, et al.
- 1976
Citation Context ...ends to a stimulus dimension the smaller the perceptual noise (e.g., Luce, Green, & Weber, 1976). [Figure 3. Contours of equal probability for a case in which the general recognition theory predicts a violation of symmetry.] Also note that these mappings provide justification for using the angle θ as a measure of perceptual independence, because θ = 90° if and only if ρ_xy = 0, which in the special Gaussian case implies... |

22 |
Theoretical analysis of an alphabetic confusion matrix
- Townsend
- 1971
Citation Context ...as been most successful in predicting a wide variety of confusion matrices over the last 20 years is the biased-choice model (Luce, 1963; Shepard, 1957; but see also, e.g., Holbrook, 1975; Luce, 1977; Townsend, 1971; Townsend & Ashby, 1982). In the biased-choice model, $P(R_j \mid S_i)$ is a function of the similarity of stimulus $S_i$ to stimulus $S_j$, denoted $\eta_{ij}$, and of the bias toward response $R_j$, denoted $\beta_j$. Specifically, ... |

19 |
Representations of qualitative and quantitative dimensions
- Gati, Tversky
- 1982
Citation Context ...re of the relevant features, then the theory becomes much easier to test. For example, in the case in which stimulus features can be identified and experimentally manipulated, Tversky and Gati (1982; Gati & Tversky, 1982) identified three ordinal properties that characterize what they called a "monotone proximity structure." If an axiom or property is ordinal, then it holds for judged dissimilarity if and only if it h... |

18 | Discrimination and generalization in identification and classification: Comment on Nosofsky - Shepard - 1986 |

17 | Multivariate statistical methods (2nd ed.) - Morrison - 1976 |

15 |
On the prediction of confusion matrices from similarity judgments
- Getty, Swets, et al.
- 1979
Citation Context ...n literatures. Most attempts to link the two have used a distance-based similarity measure to predict the confusions in recognition experiments (Appelman & Mayzner, 1982; Getty, Swets, & Swets, 1980; Getty, Swets, Swets, & Green, 1979; Nakatani, 1972; Nosofsky, 1984, 1985b, 1986; Shepard, 1957, 1958b). It is now widely suspected, however, that standard distance-based similarity measures do not provide an adequate account of percei... |

12 |
Recognition models of alphanumeric characters
- Keren, Baggen
- 1981
Citation Context ...ity function, $\eta_{ij} = \exp[-d_{ij}]$, and one with a Gaussian function, $\eta_{ij} = \exp[-d_{ij}^2]$), the city-block MDS-choice model, and the unique-feature model developed from Tversky's (1977) feature-contrast model (by Keren & Baggen, 1981; Smith, 1982; Takane & Shibayama, 1985). In the Townsend et al. (1980, 1981) experiment, two levels (presence and absence) of two stimulus components (a horizontal and a vertical line segment) were f... |

11 |
Signal detection theory and psychophysics
- Green, Swets
- 1966
Citation Context ...ing Experiment 1. Correspondence concerning this article should be addressed to F. Gregory Ashby, Department of Psychology, University of California, Santa Barbara, California 93106. ...theory (e.g., Green & Swets, 1966). We use the word recognition here in the sense of Tanner (1956) and Luce (1963), although the experimental paradigm we have in mind might be better described as identification. The important link th... |

8 |
Multidimensional unfolding: determining the dimensionality of ranked preference data
- Bennett, Hays
Citation Context ...so account for preference judgments. Not only does it contain many powerful preference models as special cases (e.g., the unfolding theory of Coombs, 1950, and Bennett & Hays, 1960), but it can also account for empirical results that cause other theories much difficulty (e.g., multiple-peaked preference functions). We feel, however, that the most important contribution of the t... |

8 |
Experimental test of contemporary mathematical models of visual letter recognition
- Townsend, Ashby
- 1982
Citation Context ...cessful in predicting a wide variety of confusion matrices over the last 20 years is the biased-choice model (Luce, 1963; Shepard, 1957; but see also, e.g., Holbrook, 1975; Luce, 1977; Townsend, 1971; Townsend & Ashby, 1982). In the biased-choice model, $P(R_j \mid S_i)$ is a function of the similarity of stimulus $S_i$ to stimulus $S_j$, denoted $\eta_{ij}$, and of the bias toward response $R_j$, denoted $\beta_j$. Specifically, (11) $P(R_j \mid S_i) = \beta_j \eta_{ij} / \ldots$ |

7 |
Psychophysical scaling
- Luce, Galanter
- 1963
Citation Context ...an scaling model (and therefore also the simple and weighted Euclidean models), but violations of the triangle inequality do not. MacKay and Zinnes (1981; see also Ennis & Mullen, 1986; Hefner, 1958; Luce & Galanter, 1963; Mullen & Ennis, 1987; Suppes & Zinnes, 1963; Zinnes & MacKay, 1983) introduced a probabilistic version of MDS. They started with the traditional Euclidean model but assumed that the stimulus coordin... |

7 |
Theory of recognition
- Tanner
- 1956
Citation Context ...rs to be violated. The weighted Euclidean model was generalized to allow for perceptual dependencies by Tucker (1972) and Carroll and Chang (1972; see also Carroll & Wish, 1974). Their idea (see also Tanner, 1956) was that the degree of perceptual dependence should be related to the angle between dimensions (see Ashby & Townsend, 1986), and so the resulting model, known as the general Euclidean scaling model,... |

7 | Perceptual sampling of orthogonal straight line features - Townsend, Hu, et al. - 1981 |

6 |
Relations between multidimensional scaling and three-mode factor analysis
- Tucker
Citation Context ...ed similarity is determined by distributional overlap. Under certain, very special conditions, overlap and distance measures agree, and thus the general Euclidean scaling model (Carroll & Wish, 1974; Tucker, 1972) is contained within the general recognition theory as a special case. On the other hand, because it is not constrained by any of the distance axioms, the general recognition theory is much more powe... |

6 | Measurement of small color differences
- Wandell
- 1982
Citation Context ...ee also Hefner, 1958; Torgerson, 1958; Zinnes & MacKay, 1983), but it can also be viewed as a multidimensional generalization of signal-detection theory (see, e.g., Green & Swets, 1966; Tanner, 1956; Wandell, 1982). The contours of equal probability of a bivariate normal distribution are always ellipses or circles. Their shape is determined by the variances and by the covariance (or correlation) parameters. ... |
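Several of the citation contexts above describe the MDS-choice model, in which the probability of response R_j to stimulus S_i is the bias-weighted similarity exp(-d_ij) normalized over all responses. A minimal sketch of that computation follows, assuming simple Euclidean distances between stimulus coordinates (one of the versions the contexts mention). The function name `mds_choice_matrix` and the example coordinates and biases are hypothetical, not taken from the paper.

```python
import math


def mds_choice_matrix(coords, biases):
    """Predicted confusion matrix under the MDS-choice model.

    Row i is the presented stimulus, column j the response. Each entry is
    bias_j * exp(-d_ij) divided by the sum of that quantity over all
    responses, with d_ij the Euclidean distance (an assumption here).
    """
    if len(coords) != len(biases):
        raise ValueError("need one bias per stimulus")
    matrix = []
    for point_i in coords:
        weights = [
            bias_j * math.exp(-math.dist(point_i, point_j))
            for point_j, bias_j in zip(coords, biases)
        ]
        total = sum(weights)
        matrix.append([w / total for w in weights])
    return matrix


if __name__ == "__main__":
    # Hypothetical stimulus coordinates in a 2-D psychological space.
    stimuli = [(0.0, 0.0), (1.0, 0.0), (3.0, 0.0)]
    for row in mds_choice_matrix(stimuli, [1.0, 1.0, 1.0]):
        print([round(p, 3) for p in row])
```

With equal biases, each row peaks on the diagonal and confusions fall off with distance. Because the predictions depend on the stimuli only through distances, this model inherits the distance axioms that the general recognition theory is described as avoiding.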