Remembering over the short-term: The case against the standard model. (2002)
Venue: Annual Review of Psychology
Citations: 74 (0 self)
BibTeX
@ARTICLE{Nairne02rememberingover,
author = {James S Nairne},
title = {Remembering over the short-term: The case against the standard model.},
journal = {Annual Review of Psychology},
year = {2002},
pages = {53--81}
}
Abstract
Psychologists often assume that short-term storage is synonymous with activation, a mnemonic property that keeps information in an immediately accessible form. Permanent knowledge is activated, as a result of on-line cognitive processing, and an activity trace is established "in" short-term (or working) memory. Activation is assumed to decay spontaneously with the passage of time, so a refreshing process (rehearsal) is needed to maintain availability. Most of the phenomena of immediate retention, such as capacity limitations and word length effects, are assumed to arise from trade-offs between rehearsal and decay. This "standard model" of how we remember over the short-term still enjoys considerable popularity, although recent research questions most of its main assumptions. In this chapter I review the recent research and identify the empirical and conceptual problems that plague traditional conceptions of short-term memory. Increasingly, researchers are recognizing that short-term retention is cue driven, much like long-term memory, and that neither rehearsal nor decay is likely to explain the particulars of short-term forgetting.
INTRODUCTION
How do we remember over the short term? It is clearly adaptive to keep recent information available in some kind of accessible form. It would be difficult to comprehend spoken language, which occurs sequentially, or read any kind of text without remembering the early part of an utterance or the themes relevant to a passage. Virtually all complex cognitive activities (reading, reasoning, problem solving) require access to intermediate steps (as in adding or multiplying two-digit numbers in the head) or other situation-specific information. For many years psychologists have essentially agreed about the main mechanism controlling the temporary storage of information.
The generally accepted view, referred to here as the standard model, is that short-term storage arises from activation, a mnemonic property that keeps information in an immediately accessible form. Permanent knowledge is activated, as a byproduct of on-line cognitive processing, and comes to reside "in" short-term (or working) memory. Short-term memory, as a whole, is simply defined as the collective set of this activated information in memory (e.g., [...]).
The simplicity of the standard model is clearly a virtue, but is it empirically justified? On the surface, the standard model violates a number of well-known tenets of memory theory. For example, decay has been roundly rejected as a vehicle for long-term forgetting for decades, certainly since the seminal arguments of John McGeoch in the 1930s [...]. It is possible that remembering over the short term is a special case, requiring theoretical proposals that do not apply in more general arenas, but the empirical case should be strong. As I review in the next section, the empirical case for the standard model did, in fact, once seem very strong (see also [...]).
Annu. Rev. Psychol. 2002.53:53-81. Downloaded from arjournals.annualreviews.org by PURDUE UNIVERSITY LIBRARY on 06/26/07. For personal use only.
THE STANDARD MODEL: REHEARSAL PLUS DECAY
As noted above, activation is the vehicle for temporary storage in the standard model. Items, once activated, are assumed to exist in a state of immediate accessibility (McElree 1998); the amount of activation, in turn, accrues from a continual trade-off between rehearsal and decay. This account has intuitive appeal (it maps well onto phenomenological experience) and it is easily expressed with concrete metaphors. For instance, think about a juggler trying to maintain a set of four plates. Tossing a plate can be seen as a kind of activation; the height of the toss corresponds roughly to the amount of activation achieved.
The juggler is able to maintain a set of activated items (plates) to the extent that each can be caught and re-tossed before gravity reduces it to an irretrievable state. The juggling metaphor is apt because it can be naturally extended to most of the well-known phenomena of immediate memory (see [...]).
The main assumptions of the standard model (activation, rehearsal, and decay) are prominent in a host of current theoretical accounts of short-term retention. Probably the best-known example of a "juggler" model is the working memory model of [...] is divided into two parts: a phonological store, which is the storage location for activated information, and a rehearsal/recoding device called the articulatory control process. Information in the phonological store is assumed to decay in roughly 2 s (which is analogous to a constant force of gravity) and can be refreshed, via rehearsal, by the articulatory control process (tossing plates into the air). Capacity limitations in immediate retention (e.g., the magic number seven) are assumed to arise from trade-offs between decay and loop-based rehearsal. There are many variants of the working memory model in the current literature (see [...]). Finally, [...]
Articulation Rate and Span
Empirically, the assumptions of the standard model receive their most convincing support from studies showing a systematic relationship between overt articulation rate and memory span. Articulation rate is assumed to correlate with the speed of internal rehearsal [...]. This relation between articulation rate and memory span is well documented and noncontroversial, although some questions remain about the exact form of the function (e.g., linear or possibly quadratic) [...]. Collectively, these data seem to offer compelling support for the standard model.
Articulation rate, and inferentially the speed of internal rehearsal, varies in a more or less direct way with memory span. In the absence of rehearsal, immediate memory performance declines and no longer shows the clean connection with spoken word duration. However, as we shall see, the empirical case loses much of its vigor on closer inspection. Over the past decade, a number of important exceptions to the data patterns listed above have appeared: For example, it turns out that articulation rate and span are not always closely related, and spoken duration effects sometimes emerge in the absence of rehearsal. Moreover, the theoretical assumptions of the standard model, when examined closely, turn out to contain an unsettling amount of ambiguity. In the sections that follow, I examine the discrepant data, first for rehearsal and then for the process of decay, and discuss the conceptual problems that remain unresolved.
PROBLEMS WITH REHEARSAL
I begin by discussing some of the empirical difficulties associated with the rehearsal arm of the standard model. At the outset, it is worth noting that proponents of the standard model (e.g., Baddeley 1992) rarely, if ever, specify the dynamics of the rehearsal process in any kind of systematic way. As [...] it is clearly necessary to know exactly when such rehearsal is taking place" (Brown & Hulme 1995, p. 599). Yet, the protocol of covert rehearsal is likely to change with a host of variables, such as list length, modality of presentation, presentation rate, lexicality, or interitem similarity. Does rehearsal occur only during stimulus input, or does it also occur during response output? Does it matter whether one rehearses based primarily on sound, or is it necessary to access meaning in order to maintain the activation of permanent knowledge? The idea of the standard "rehearsal + decay" model seems simple enough, but its actual implementation is likely to be quite complex.
Dissociating Rehearsal and Span
In its simplest form, the standard model assumes that most of the variability in immediate retention is attributable to rehearsal. Decay is presumably fixed, like gravity, so effective remembering over the short term hinges on one's ability to maintain activation in the face of decay. Such an account is clearly undermined by the fact that recall differences can emerge when articulation rates are held constant. For example, [...]
Other researchers have found that manipulations of item duration sometimes fail to produce concomitant changes in immediate memory performance. [...] (2000) varied spoken duration in disyllabic words, holding constant a host of potentially confounding factors (e.g., frequency, familiarity, number of phonemes) and failed to find any advantage for short-duration words across several experiments; word duration effects emerged only when the original word pools used by [...]
Other evidence can be marshaled in support of articulatory processes in immediate retention, although the implications of the data are unclear. For example, Wilson & Emmorey (1998) recently found a duration-based word length effect using word lists generated from American Sign Language. Lists composed of short signs were remembered better than lists of long signs under conditions in which "length" could be attributed uniquely to the duration of the signing process. Of course, one can question the relevance of these data to immediate retention and rehearsal of verbal lists, although [...]
Eliminating Rehearsal
As noted earlier, the fact that item-duration effects can be eliminated under articulatory suppression is usually seen as strong support for the rehearsal arm of the model. [...] articulatory suppression throughout the recall period [...]. Word length effects also occur under conditions of extremely rapid visual presentation. Coltheart & Langdon (1998) presented lists of either short or long words at rates approximating 100 msec/item.
Presenting an entire list in well under a second, it was argued, should make normal rehearsal virtually impossible. Yet, a highly reliable word length effect was found; moreover, the word length effect was eliminated, under these same rapid presentation conditions, when subjects engaged in concurrent articulatory suppression. Word length effects are also found in patients suffering from serious articulatory difficulties. At face value, these data seem troubling for the standard model. In the absence of rehearsal, word length effects can remain. However, the standard model does not necessarily predict that the word length effect should be eliminated under these conditions: Long words, by definition, are associated with longer presentation durations. This means that one would expect the word length effect to disappear in the absence of rehearsal only when the total presentation time has been equated for short and long lists. Suppose it takes 1 s to present a long word and only 0.5 s to present a short word. For pure lists of any length, then, the retention interval for a given long item in the list, with the exception of the last list item, will be longer than for the comparable item in the short list (e.g., for lists of 6 items, 5 s will elapse before the first long item can be recalled versus only 2.5 s for the first short item). Assuming decay, we would expect a word length effect under these circumstances, even without considering rehearsal as a factor (or by assuming that all words are rehearsed only once). In many experimental studies of the word length effect, item presentation rates are controlled (e.g., 1-2 s per item) so the total passage of time elapsing during presentation is, in principle, controlled.
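The timing argument in the worked example above (1 s per long word, 0.5 s per short word, lists of 6 items) can be checked with a few lines of arithmetic. This is a minimal sketch, not part of the original article: the function name `presentation_delay` is hypothetical, and it assumes items are presented back to back with no pauses, so that the delay for an item is simply the time taken by the items that follow it.

```python
def presentation_delay(position, list_length, item_duration):
    """Seconds that elapse between the end of the item at `position`
    (1-indexed) and the end of the list.

    Assumption (not from the article): items are presented back to back,
    each occupying exactly `item_duration` seconds.
    """
    if not 1 <= position <= list_length:
        raise ValueError("position must lie within the list")
    return (list_length - position) * item_duration


LIST_LENGTH = 6        # list length used in the text's example
LONG_WORD_S = 1.0      # presentation time for a long word (from the text)
SHORT_WORD_S = 0.5     # presentation time for a short word (from the text)

for position in range(1, LIST_LENGTH + 1):
    long_delay = presentation_delay(position, LIST_LENGTH, LONG_WORD_S)
    short_delay = presentation_delay(position, LIST_LENGTH, SHORT_WORD_S)
    print(f"item {position}: long list {long_delay} s, short list {short_delay} s")
```

Under these assumptions the delay is longer for the long-word list at every position except the last, where both delays are zero, which is the point the example makes about decay operating without any appeal to rehearsal.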
However, as reviewed in the next section, the main locus of the word length effect appears to be at output (during recall), when output timing is rarely controlled experimentally.
Output Effects
There is considerable evidence to suggest that word length effects originate, at least in part, during recall output. First, as discussed above, when list items are presented aloud, articulatory suppression eliminates the word length effect only if it occurs during output. Importantly, [...]
Of course, pinpointing the locus of word length effects at output does not mean that rehearsal is responsible. Again, the raw passage of time could be critical: Recalling long words first imposes more of an output delay on the remaining words than recalling short words first. In fact, when spoken recall protocols are analyzed in microscopic detail, it turns out that both the duration of the recall response and the pauses that occur between recall responses are importantly related to span. A number of studies have found that interword pause durations correlate significantly with span (e.g., [...]).
Finally, if the durations of the output pauses are too short for rehearsal, then when can rehearsal be occurring during output? There are two remaining possibilities: Rehearsal could be occurring during the actual output responses or during the preparatory interval preceding recall. The former case seems unlikely; in fact, one could argue that rehearsal at any point after output has begun is likely to have a detrimental effect because one would need to coordinate two sets of items, the output set and the rehearsal set (Avons et al. 1994). It is more likely that some rehearsal occurs during the preparatory interval immediately preceding recall.
However, these intervals usually show no correlation with either memory span or articulation rate (e.g., [...]).
Summary
The widely held belief that internal rehearsal plays a large role in remembering over the short term, refreshing activation, rests primarily on the well-documented relationship between articulation rate and span. However, as the preceding section indicates, there are many reasons to question whether the correlation between articulation and span is actually caused by rehearsal. Perhaps most telling are the repeated failures to replicate item duration effects when other potentially confounding factors are controlled (e.g., [...]).
It is also well documented that many item-based differences in immediate retention remain when (a) rehearsal is disrupted or blocked and (b) articulation rates are held constant. In the former case, the passage of time could be held responsible (that is, through differential decay), but there is no simple way for the standard model to explain the large span differences found at matched articulation rates. Thus, the span advantage seen for words over nonwords, high-frequency over low-frequency words, or concrete over abstract words must be attributable to factors other than those contained in the rehearsal plus decay assumptions of the standard model. What then accounts for the overall relationship between measures of articulation and immediate retention? One possibility is that there is some unspecified third factor that is associated with both rapid articulation ability and memory skill. Cowan (1999) recently suggested that "the ability to read or extract information from phonological memory quickly" might be involved; [...]
PROBLEMS WITH DECAY
The second arm of the standard model is decay, defined as the loss of trace information exclusively as a function of time. Activation, the vehicle for short-term storage, is believed to decay spontaneously, much like a plate tossed into the air falls spontaneously from the force of gravity.
Of course, the fact that forgetting proceeds with time is self-evident, but is it really the passage of time that causes the forgetting? Time is correlated with forgetting, but it may be the events that happen in time that are actually responsible for the loss. The famous analogy used by [...]
With the exception of short-term memory environments, decay is rarely, if ever, used by memory theorists seeking to explain forgetting. There are both empirical and theoretical reasons to reject the concept. First, long-term retention can stay constant, or even improve, with the passage of time (e.g., spontaneous recovery and reminiscence). Second, the rate and extent of forgetting often depend on the specific activities that occur during the retention interval. For example, the similarity between original and interpolated learning determines how much forgetting one sees for the original material (e.g., Osgood 1953). Neither of these empirical results can be explained through a simple appeal to decay. Importantly, as discussed below, both of these results apply to remembering over the short term. Theoretically, the concept of decay is equally troubling. To propose that memories are lost spontaneously with time ignores the potential contribution of the retrieval environment (e.g., Tulving 1983). In the standard model remembering is essentially tied to a property of the trace (activation), and no claims are made about the effectiveness of retrieval cues. One could assume that short-term memory is also influenced by the availability of retrieval cues (e.g., Nairne 1990a, 2001), but this undercuts the explanatory power of the decay concept. Trace features might be lost over time, in a process akin to radioactive decay, but the effects on memory will depend on the cues present at the time of test.
In principle, losing trace features could produce a trace that is actually more compatible with the cues present at time 2 than at time 1, producing little loss or even improved retention. Such a view is widely accepted in the study of long-term memory, in which the passage of time per se is rejected as a sufficient condition for forgetting, but similar analyses are rarely, if ever, made in the short-term memory literature. Why does decay remain popular? Part of the reason may be phenomenological: We have all experienced the rapid loss of information from consciousness, although the specific cause of the loss is not consciously apparent. More likely, though, the reason is historical. In the original Brown-Peterson experiments [...]
Dissociating Time and Forgetting
In fact, it is relatively easy to falsify simple versions of decay theory. If time passes and there is no memory loss, or perhaps an improvement in retention, then factors other than decay must be operating. In the standard model rehearsal counteracts decay through reactivation of the trace. However, a number of studies have found little, or no, forgetting in contexts in which rehearsal is unlikely. For example, Greene (1996) found no evidence for loss in the traditional Brown-Peterson paradigm when distractor length was manipulated between subjects (see also [...]). [...] interference by using different items on every trial and reconstruction of order as the retention measure; in a reconstruction test the just-presented items are given back in random order and the task is to place the items into their original presentation order. Under these conditions very little evidence of forgetting was found across retention intervals ranging from 2 to 96 s; e.g., reconstruction performance averaged 78% correct after 2 s of distraction and 73% correct after 96 s.
Historically, proponents of the standard model have explained data like these by appealing to long-term memory (e.g., [...]).
On reflection, it is difficult to see why this account has such wide appeal. First, there is no direct evidence confirming that subjects actually do shift from long- to short-term retrieval within an experimental session. Moreover, [...] found that the phonological similarity effect remains robust when interference from prior trials is minimized. If subjects tend to rely on recovery from long-term memory in an "uncluttered" environment, then phonological similarity should play a reduced role (i.e., because long-term retrieval tends to be influenced more by semantic factors). Second, and more troubling, short-term memory is assigned a kind of "back-up" status in this account, that is, something to be used only when recovery from long-term memory is problematic. It suggests that a subject's first line of defense is to recover information from long-term memory; recovery from short-term memory occurs only when long-term traces are difficult to access. Yet, of course, in most natural environments people have not been exposed to multiple "lists" of to-be-remembered information. Thus, in the prototypical case (remembering a telephone number) it is not clear why short-term memory would even be needed.
A better explanation is given by general distinctiveness accounts, which assume that forgetting is caused by a failure to discriminate current trial information from information presented on previous trials (e.g., [...]). [...] activity for group-specific intervals (e.g., one group counted for 10 s, another for 15 s, and another for 20 s). Foreshadowing Greene (1996), equivalent amounts of forgetting were found across groups in this between-subject design (0.33, 0.30, and 0.30, respectively). On the critical trial, however, all groups were switched to the same 15-s distractor period.
Retention performance dropped in the 10-s group (from 0.33 to 0.20), stayed roughly constant in the 15-s group (0.30 to 0.28) and improved in the 20-s group (0.30 to 0.38). Note that the passage of time (and therefore the opportunity for decay) was equated across the groups on the critical 15-s trial, yet performance depended on the timing of prior trials. One can parsimoniously explain these data by assuming that the relative, rather than absolute, durations of distractor activity affect the discriminability of target information at test (Baddeley 1976). Memory improves whenever target items can be easily discriminated from the memories created by prior trials. According to distinctiveness accounts, it is the ratio of the interpresentation interval to the length of the current retention interval that matters. In the Turvey et al. (1970) case, the interpresentation interval corresponds to the period separating the presentation of the to-be-remembered information on trial N-1 from the presentation on trial N. Note that on the critical trial containing 15 s of distraction, prior trial information is relatively "closer" in time for the 10-s group (10/15 = 0.67) than for the 15-s group (15/15 = 1.0) or the 20-s group (20/15 = 1.33). The actual data are predicted well by these ratios. The [...]
The Role of Intervening Activity
The preceding studies establish quite clearly that the passage of time is not necessarily a predictor of memory loss. Time is often correlated with forgetting, but the correlation is far from perfect. The exceptions are important, though, because they help to falsify simpleminded notions of decay. Decay theories also have trouble explaining why forgetting can depend on the specific activities that occur during the retention interval. In his original work on the distractor paradigm, [...]
For example, significantly more forgetting is found in a short-term memory environment when the modalities of presentation and distraction match.
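Returning briefly to the distinctiveness ratios reported above for the Turvey et al. (1970) groups, the relation between each ratio and the change in performance on the critical trial can be tabulated directly. This is only an illustrative sketch: the function name `distinctiveness_ratio` and the data layout are hypothetical, and the ratio is computed exactly as in the text's own arithmetic (prior distractor interval divided by the 15-s interval on the critical trial). The performance values are the ones quoted in the text.

```python
CRITICAL_INTERVAL_S = 15  # distractor period shared by all groups on the critical trial

# Prior distractor interval (s) -> (performance before, performance on critical trial),
# using the values quoted in the text.
GROUPS = {
    10: (0.33, 0.20),
    15: (0.30, 0.28),
    20: (0.30, 0.38),
}


def distinctiveness_ratio(prior_interval_s, current_interval_s):
    """Ratio used in the text: prior interval over current retention interval."""
    return prior_interval_s / current_interval_s


rows = []
for prior_interval, (before, critical) in GROUPS.items():
    ratio = round(distinctiveness_ratio(prior_interval, CRITICAL_INTERVAL_S), 2)
    change = round(critical - before, 2)
    rows.append((prior_interval, ratio, change))
    print(f"{prior_interval}-s group: ratio {ratio}, change in performance {change}")
```

The ordering of the three ratios matches the ordering of the performance changes (a drop, rough constancy, an improvement), which is the sense in which the ratios predict the data; the sketch makes no claim about the exact functional form linking the two.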
More forgetting is found when target items are presented aloud and the distractor activity is also auditory [...]. Additional support for modality-specific interference comes from the extensive literature on the suffix effect (for reviews see [...]).
Several investigators have attempted to reduce the similarity between the target material and the distractor task to such an extent that interference between the two seems unlikely (e.g., [...]). This time-based performance loss seems consistent with decay theory. Note that a comparable result would not be found with verbal material because the subject could presumably rehearse the material during the retention interval. The fact that significant loss occurred, with essentially no interfering material present (and rehearsal possible, in principle), suggests that the forgetting was due uniquely to the passage of time. However, in a subsequent reanalysis of their data, [...]
Summary
In his famous attack on Thorndike's "law of disuse," [...] As the preceding review indicates, both of the main arguments used by [...] However, it is difficult to see how proponents of decay can explain all the situations in which time exerts no influence. For instance, as reviewed in the previous section, current research indicates that it is complexity, rather than actual duration, that mediates most word length effects in immediate recall. Again, long durations can actually lead to better immediate memory in some situations.
Moreover, several researchers have reported that output duration is a better predictor of memory span than presentation duration, and the typical output period far exceeds the 2-s decay window (e.g., [...]). [...] the data of Baddeley & Scott (1971), limited forgetting over the first few seconds of distraction, can be explained by appealing to intrasequence interference; that is, interference produced among the items in a current to-be-remembered list (see also Melton 1963). As I discuss below, one of the major advantages that models of this type have over the standard model is their assumption that remembering is cue driven.
RECONCEPTUALIZING SHORT-TERM MEMORY
In this section I discuss alternatives to the standard model. As noted throughout, one of the most important conceptual hurdles faced by activation accounts is the proposal that recovery from short-term memory is essentially independent of cueing. Kintsch and colleagues recently noted that questions about retrieval touch on "the essence of working memory because of the common assumption that information 'in' working memory is directly and effortlessly retrievable" [...]
Cue-Driven Immediate Retention
There are many factors that have influenced the movement toward cue-driven accounts of immediate retention. One factor is the sensitivity of immediate recall to item-specific variables such as lexicality, word frequency, and concreteness. As noted above, in the standard model there is no obvious reason why lexicality or concreteness should affect the availability of an item, once activated, yet each produces large and consistent effects on immediate recall.
Researchers generally refer to these effects as long-term memory contributions to immediate memory, and they are increasingly assumed to arise from a redintegration process wherein the decayed or degraded immediate memory trace is used as a cue to sample an appropriate candidate from long-term memory (e.g., [...]).
Strong support for the cueing interpretation of proactive interference comes from studies manipulating the nature of the cues at test. For example, one can obtain release from proactive interference at the point of test, after the list has been presented, if discriminating cues are provided [...].
Evidence for cue-driven immediate retention also comes from the analysis of errors in immediate recall. Errors are usually not random, but rather follow certain rules, suggesting that position of occurrence may be an important retrieval cue. For example, when an item is recalled in the wrong serial position in immediate recall, it tends to be placed in a nearby position. One typically finds regular error gradients that drop off with distance from the original position of occurrence ([...] 1991). These data suggest that people are not simply outputting activated items directly from short-term memory, but rather are using position of occurrence as a retrieval cue to decide what happened moments before.
Although there may be disagreements about which cues predominate, collectively these data have led many short-term memory theorists to conclude that remembering over the short term is definitely cue driven. As I discuss below, recent formal models of immediate retention tend to be cue driven, although some are hybrid models that assume that some form of "direct retrieval" from short-term memory is possible. Even accounts that closely mimic the assumptions of the standard model, such as Baddeley's working memory model (Baddeley 1986), recognize that direct retrieval cannot explain all the particulars of immediate retention.
For example, it is difficult to derive the phonological similarity effect (poorer memory for lists composed of similar sounding items) from the assumptions of the standard model. Working memory proponents generally assume that memory is impaired because recall requires ". . . discrimination among the memory traces . . . similar traces will be harder to discriminate, leading to a lower level of recall" [...]
Hybrid Models
The term hybrid model refers to a class of current models that retain important elements of the standard model (e.g., activation-based remembering, rehearsal, and/or decay) but assume that retrieval cues play an important role in short-term remembering as well. A detailed review of these models is beyond the scope of this article, so I provide only a few brief descriptions here. Schweickert (1993) has proposed a multinomial processing tree model that closely mimics the standard model except for the proposal of an additional redintegration stage. List presentation leads to the formation of active traces in short-term memory, which over time become degraded. Schweickert is noncommittal about the actual process controlling degradation, assuming that either decay or interference may contribute. During recall, the subject's first line of attack is a "direct readout" of the active trace, as in the standard model, which occurs successfully with probability I. If direct readout fails (with probability 1-I), an attempt is made to interpret the degraded trace through redintegration. This second stage of Schweickert's model (the interpretation or redintegration stage) provides a vehicle for explaining many of the findings discussed above that have proven troubling for the standard model. The degraded trace becomes a cue of sorts that is interpreted by accessing long-term knowledge, particularly about language processing (see also [...]). [...] that the effects of lexicality, word frequency, and concreteness are presumed to occur.
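The two-stage recall process just described can be expressed as a single probability. The sketch below is a hedged illustration rather than Schweickert's own formulation as given in this text: it assumes that redintegration succeeds with some probability R (a symbol introduced here only for illustration) and that a correct response arises either from direct readout (probability I) or, failing that, from successful redintegration, giving P(correct) = I + (1 - I) * R. All numeric values are hypothetical and chosen only to make the arithmetic easy to follow.

```python
def p_correct(p_readout, p_redintegrate):
    """Probability of correct recall in a two-stage tree:
    direct readout succeeds with probability I (p_readout); otherwise
    redintegration is attempted and succeeds with probability R
    (p_redintegrate). R is an illustrative assumption, not a symbol
    taken from the text.
    """
    for p in (p_readout, p_redintegrate):
        if not 0.0 <= p <= 1.0:
            raise ValueError("probabilities must lie in [0, 1]")
    return p_readout + (1.0 - p_readout) * p_redintegrate


# Hypothetical parameter values, for illustration only.
READOUT = {"intact trace likely": 0.75, "intact trace less likely": 0.5}
REDINTEGRATION = {"easy to interpret": 0.5, "hard to interpret": 0.25}

for readout_label, i_value in READOUT.items():
    for redint_label, r_value in REDINTEGRATION.items():
        result = p_correct(i_value, r_value)
        print(f"{readout_label}, {redint_label}: P(correct) = {result}")
```

With these made-up values, changing R matters more when I is low (0.75 versus 0.625) than when I is high (0.875 versus 0.8125), because redintegration is consulted only when direct readout fails. That is the structural reason a variable acting on the second stage can be dissociated from one acting on the first.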
For example, the degraded traces for words are presumably easier to interpret than those for nonwords, leading to the lexicality effect in immediate recall. By placing the locus in the redintegration stage, the model is able to dissociate so-called long-term memory influences, such as lexicality and frequency, from the effects of articulation rate, which are presumed to primarily affect the direct readout stage. One of the advantages of the multinomial tree model is its clear predictions about how different variables should interact. Word length and lexicality are assumed to selectively influence different stages of the recall process: Word length affects the probability of trace degradation (for the same reasons described by the standard model), and lexicality affects the probability of trace interpretation. As a result, these factors, when combined factorially, are expected to produce an underadditive interaction in correct recall. Just this pattern has been obtained in relevant studies, i.e., the size of the word length effect is smaller for words than nonwords [...].
The "start-end" model proposed recently by [...] To handle time- and item-based effects, Henson (1998) appeals to the main assumptions of the standard model. In particular, he assumes that "each presentation and rehearsal of an item activates its phonological representation to a fixed amount that subsequently undergoes exponential decay" (Henson 1998, p. 106). The activation process increases the probability of recall directly, by bringing an item closer to its recall threshold, although it can increase the chances of phonological confusions as well (thus explaining the phonological similarity effect). Word length effects are explained by appealing to the dynamics of rehearsal: Long words tend to receive less activation, because they cannot be rehearsed as efficiently, and
SHORT-TERM MEMORY 73 recall suffers as a result. Thus, as in the standard model, activation plays a role as a mnemonic property that is independent of any cueing mechanism. For the reasons described throughout this article, these assumptions of the start-end model are not well supported by the data. Unitary Models The two models just described are examples of current hybrid models. There have been other efforts to combine elements of the standard model with some kind of cue-based mechanism (e.g., Burgess & Hitch 1999), but I turn my attention now to models that contain virtually no assumptions in common with the standard model. These models assume no direct connection between activation level and memory success, propose little or no role for rehearsal, and reject the concept of decay in favor of item-based interference. I refer to these models as unitary models because they also assume similar processes for short-and long-term remembering (what differs is the retrieval cues in effect). Although not reviewed here, there is considerable evidence suggesting that short-and long-term memory often follow similar rules (e.g., Short-term forgetting in this model occurs because the available cues become poor predictors of the target items. Processing records are overwritten by subsequently occurring material (as a function of similarity), making it more difficult to interpret the records correctly. Basing interference on similarity enables the model to explain why performance can depend on the specific activities that occur during the retention interval, and it helps explain benchmark phenomena such as the phonological similarity effect as well. Overall, increasing the similarity among list items tends to reduce the predictive value of common features; any given residual cue tends to be predictive of several target items (it becomes overloaded), which lowers the chances of remembering a given target item in its correct position. 
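The cue-overload argument can be made concrete with a toy calculation. The sketch below is not the published simulation of any model discussed here. It only assumes that items are vectors of discrete features, that similarity falls off exponentially with the proportion of mismatching features, and that a cue selects a candidate with probability proportional to similarity. The function names and the `scale` parameter are hypothetical and chosen for illustration.

```python
import math


def distance(cue, item):
    """Proportion of feature positions at which cue and item mismatch."""
    mismatches = sum(1 for c, f in zip(cue, item) if c != f)
    return mismatches / len(item)


def recall_probability(cue, target_index, candidates, scale=5.0):
    """Probability that the cue selects the target among the candidates.

    Assumption: similarity = exp(-scale * distance), and the choice is
    proportional to similarity (a simple ratio rule).
    """
    sims = [math.exp(-scale * distance(cue, item)) for item in candidates]
    return sims[target_index] / sum(sims)


# Dissimilar list: no two items share any feature value.
dissimilar = [(1, 1, 1, 1), (2, 2, 2, 2), (3, 3, 3, 3)]
# Similar list: items share three of their four feature values.
similar = [(1, 1, 1, 1), (1, 1, 1, 2), (1, 1, 1, 3)]
# Longer dissimilar list: two extra competitors for the same cue.
longer = dissimilar + [(4, 4, 4, 4), (5, 5, 5, 5)]

# In every case the cue is an intact copy of the first list item.
p_dissimilar = recall_probability(dissimilar[0], 0, dissimilar)
p_similar = recall_probability(similar[0], 0, similar)
p_longer = recall_probability(longer[0], 0, longer)

print(f"dissimilar list: {p_dissimilar:.3f}")
print(f"similar list:    {p_similar:.3f}")
print(f"longer list:     {p_longer:.3f}")
```

Under these assumptions the same cue is a poorer predictor of its target when list items share features, and also when more candidates compete for it, which is the pattern the text attributes to cue overload.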
Performance declines with increasing list length for a very similar reason. Cues in short-term memory are effective only to the extent that they are distinctive; that is, they uniquely predict target items. Simulations of the feature model have been applied to most of the phenomena of immediate memory with success (see Nairne 1990a; [...]).
Cue-dependent forgetting is also central to the OSCAR model proposed recently by [...] The major difference between OSCAR and the loop model [...] must differ in a fundamental way from the retrieval processes governing long-term retention.
SUMMARY AND CONCLUSIONS
The preceding two sections provide a brief and selective review of some recent formal models covering immediate retention. In each of these models remembering over the short term is assumed to be primarily cue driven, although rehearsal and decay (the two main assumptions of the standard model) contribute to performance in some cases. Focusing on cue-driven processes, as opposed to the direct retrieval of activated "items," offers a number of advantages. First, and perhaps most importantly, it lays the groundwork for a truly unified account of remembering. Virtually all researchers recognize that long-term remembering is cue driven; acknowledging that short-term remembering is cue driven as well helps explain why short- and long-term retention often show similarities, and it releases the theorist from the unreasonable assumption that activated traces have special properties outside of particular retrieval environments. More concretely, cue-driven accounts easily handle the item-specific long-term memory influences that characterize remembering over the short term. As we have seen, immediate retention is influenced by a number of variables, such as lexicality, word frequency, and concreteness, that are likely to affect one's ability to sample an appropriate recall candidate from long-term memory.
In addition, it is easy to see how memory performance could decrease, stay constant, or even improve over time depending on the available constellation of retrieval cues. In unitary models, such as the feature model, short-term memory is simply conceived as a repository for cues that are used for reconstructing the immediate past. No items are stored, only feature-based cues that are, by themselves, not recallable. Such a conceptualization is vastly different from intuitive notions about activated items "sitting" in short-term memory awaiting direct recall. This kind of view also has no trouble handling the possibility that inhibitory effects will occur in immediate retention. It is almost certainly the case that item accessibility can be lowered over the short term (that is, you become less likely to remember an item), and there is no easy way to represent inhibition in the standard model. Rather than assuming an item is in some special state of inaccessibility, one can assume that there are simply cue constellations that reduce the likelihood of recovering an item as a possible response (response suppression mechanisms may also be at work in some instances). Note that inhibition conceived in this way is not a special state of the item; it is simply a byproduct of the particular cue constellation that happens to be driving memory.
What role then should the standard model play in our efforts to understand how we remember over the short term? The juggler metaphor certainly has heuristic value, and it provides a nice organizational rubric for a variety of immediate memory phenomena. However, even as a heuristic, the standard model is misleading. It leads one to the conclusion that forgetting rates are fixed, like gravity, rather than variable, as much of the data suggest.
It also suggests that the main vehicle for short-term storage is rehearsal when, in fact, much of the variability in immediate retention turns out to be independent of rehearsal. Finally, it leads one to the conclusion that remembering is a direct byproduct of activation. Whereas it may be reasonable to propose activation in the brain, it is not activation per se that predicts performance. It is the interpretation of that activation, through a cue-driven retrieval process, that explains how we remember over the short term.
ACKNOWLEDGMENTS