
## A comparative meta-analysis of Rorschach and MMPI validity. (1999)

Venue: Psychological Assessment

Citations: 29 (1 self)

### BibTeX

@ARTICLE{Hiller99acomparative,
  author  = {Jordan B. Hiller and Robert Rosenthal and Robert F. Bornstein and David T. R. Berry and Sherrie Brunell-Neuleib},
  title   = {A comparative meta-analysis of Rorschach and MMPI validity},
  journal = {Psychological Assessment},
  year    = {1999},
  pages   = {278--296}
}


### Abstract

We conducted a new meta-analysis comparing criterion-related validity evidence for the Rorschach and the MMPI. The unweighted mean validity coefficients (rs) were .30 for the MMPI and .29 for the Rorschach, and they were not reliably different (p = .76 under a fixed-effects model, p = .89 under a random-effects model). The MMPI had larger validity coefficients than the Rorschach for studies using psychiatric diagnoses and self-report measures as criterion variables, whereas the Rorschach had larger validity coefficients than the MMPI for studies using objective criterion variables.

The Rorschach Inkblot Method and the Minnesota Multiphasic Personality Inventory (MMPI) are the two most widely used instruments for the assessment of personality and psychopathology.

We express our thanks to Mark Hilsenroth and Radhika Krishnamurthy, who provided help with effect size selection for Rorschach and MMPI studies, respectively. We are grateful to Kevin Parker, who provided access to his meta-analytic database. Correspondence concerning this article should be addressed to Jordan B. Hiller, who is now at Abt Associates Inc., 55 Wheeler Street, Cambridge, Massachusetts 02138. Electronic mail may be sent to jordan_hiller@abtassoc.com.

Second, two meta-analyses were published, both comparing criterion-related validity evidence for the Rorschach to that of its chief rival, the MMPI. In spite of these developments, the debate about the reliability and validity of the Rorschach has continued.

Atkinson's (1986) meta-analysis used all conceptual Rorschach and MMPI studies (i.e., those guided by a priori hypotheses) listed in Psychological Abstracts for the years 1960, 1965, 1970, 1975, and 1980. Two strategies were used to evaluate validity evidence from these studies. When enough information was provided in study reports, effect sizes were computed (276 Rorschach effect sizes and 237 MMPI effect sizes).
When effect sizes were not calculable, Atkinson calculated a ratio for each study of the number of statistical tests that were significant at p < .05 to the total number of statistical tests performed (39 Rorschach studies and 29 MMPI studies). These ratios, then, reflected the proportion of significant findings out of all tests computed in each study. Limitations of analyses using such "box-score" approaches are well known.

Problems with general meta-analytic technique were evident in the Atkinson (1986) meta-analysis. The effect sizes used (r², ω², Cramer's V) were not satisfactory, because they assume only positive values, and they cannot indicate whether the direction of a validity result is consistent with or opposite to the predicted association.[1] Thus, two studies in this meta-analysis with exactly contradictory results would nevertheless yield identical effect sizes. Measures such as ω² and Cramer's V are also not appropriate for meta-analysis because they can be computed from unfocused significance tests (F with more than one df in the numerator and χ² with more than one df, respectively). It is not clear whether such unfocused effect sizes entered the meta-analysis, but the use of Cramer's V rather than the simple φ coefficient suggests that this is the case. Also, Atkinson sometimes extracted several effect sizes from individual studies and treated them as if they were independent. This procedure, although useful for some purposes, violates the assumption of independence among effect sizes, a violation that can lead to serious errors in the computation of significance levels.

[3] A further problem with ω² is that it consistently underestimates the magnitude of results, especially when sample sizes are low. This is due to a correction factor in the formula for ω², which is meant to adjust the statistic for chance levels of association.
The effect of this correction factor is that even when r is .50 or higher, the value of ω² can still be zero when sample sizes are modest; ω² thus prevents results from properly contributing information to a meta-analysis. This effectively defeats one of the primary purposes of meta-analytic work, which is to aggregate effects accurately across studies. Thus, even if ω² had been used only for focused studies with appropriate signs, it would still be inappropriate for meta-analytic use.

Garb et al. (1998) described other problems in the Parker et al. (1988) meta-analysis. Garb and colleagues concluded that some of the effect sizes used by Parker et al. tended toward zero when the validity of the test is supported (e.g., a near-zero effect size reflecting the fact that the number of responses on a Rorschach

[1] Cramer's V was mistakenly identified as 8 in the article by

[3] An example will illustrate the problem. If three groups of subjects are compared on a certain MMPI scale, then ω² could be computed from the F test with 2 df in the numerator. How should the sign be determined for the effect size? Assuming that ties are impossible, there are six possible configurations of high, middle, and low mean scores for three groups. Allocation of two signs (+ or −) to the six patterns is not feasible or sensible. If one of the six patterns is truly indicative of test validity, then a more appropriate procedure is to compute the effect size from a contrast analysis with 1 df, and the sign of the t (and the effect size) would reflect the degree to which the outcome agreed or disagreed with the pattern specified by the contrast weights.

A conceptual issue that affects both of the original meta-analyses as well as the

[7] This strategy of segregating exploratory studies is problematic, because it depends on the authors of individual studies for determinations of what constitutes relevant validity data.
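The two footnoted points can be checked numerically. The sketch below uses invented numbers (the helper names `r_from_t`, `omega_sq_from_t`, and `contrast_effect_size` are illustrative, not from the paper): part (a) shows how the chance correction can drive a common ω² estimator to zero even when r = .50, and part (b) shows how a 1-df contrast yields a signed effect size that an omnibus F cannot.

```python
import math

def r_from_t(t, df):
    """Signed r from a focused t test: r = t / sqrt(t^2 + df)."""
    return t / math.sqrt(t**2 + df)

def omega_sq_from_t(t, n_total):
    """Two-group omega^2 estimate, truncated at zero by the chance correction."""
    return max(0.0, (t**2 - 1) / (t**2 + n_total - 1))

# (a) Modest sample: N = 5 participants in two groups, observed t = 1.0 (df = 3).
print(round(r_from_t(1.0, 3), 2))     # r = 0.5
print(omega_sq_from_t(1.0, 5))        # omega^2 = 0.0 -- the effect vanishes

def contrast_effect_size(means, ns, ms_within, weights):
    """1-df contrast: t = sum(w*m) / sqrt(MSw * sum(w^2/n)); signed r follows."""
    L = sum(w * m for w, m in zip(weights, means))
    se = math.sqrt(ms_within * sum(w**2 / n for w, n in zip(weights, ns)))
    t = L / se
    df = sum(ns) - len(ns)            # error df from the one-way ANOVA
    return t, t / math.sqrt(t**2 + df)

# (b) Three groups predicted to order low < middle < high (weights -1, 0, 1).
t, r = contrast_effect_size([50.0, 55.0, 62.0], [10, 10, 10], 100.0, [-1, 0, 1])
print(round(t, 2), round(r, 2))       # t = 2.68, r = 0.46 (sign tracks prediction)
```

Reversing the group ordering in (b) would flip the sign of both t and r, which is exactly the directional information that ω² and Cramer's V discard.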
Validity evidence is validity evidence, regardless of whether an author made an a priori prediction or not. This strategy runs the risk of excluding relevant validity evidence simply because an author failed to make a reasonable prediction; conversely, it runs the risk of including irrelevant or misleading evidence when study authors falsely claim to have made a priori predictions concerning post hoc discoveries.

For all these reasons, lingering questions remain concerning the meta-analytic data bearing on Rorschach and MMPI validity. The present meta-analysis of criterion-related validity evidence for the Rorschach and the MMPI was undertaken in an effort to address some of the problems of the other meta-analyses. We used a random sample of studies from the MMPI and Rorschach literature published between 1977 and 1997, and we asked expert judges to select appropriate validity evidence from Rorschach and MMPI investigations, enabling us to include data from both exploratory and confirmatory studies. Furthermore, we conducted several moderator analyses to shed light on the circumstances under which Rorschach and MMPI variables might prove to have greater or lesser validity.

#### Method

##### Literature Search

PsycLIT searches were used to identify potentially relevant studies published between January 1977 and December 1997. The start of this period was chosen in accordance with the focus of this Special Section on the research literature concerning the Rorschach and MMPI published since 1977; the end of the period reflected the most recent information available in PsycLIT at the time the literature search was conducted. MMPI articles were identified by searching for the terms "MMPI or (Minnesota and Multiphasic)" in article titles and abstracts; Rorschach articles were identified with the single search term "Rorschach." These searches yielded 4,378 MMPI and 1,793 Rorschach articles.
In addition to the published literature on the MMPI and Rorschach, we attempted to obtain unpublished studies in this area. Using letters, e-mail, phone, and fax, we attempted to contact 115 researchers who had presented research on the MMPI or the Rorschach at the Society for Personality Assessment between 1993 and 1997, asking them to send us unpublished studies conducted by them or by their colleagues.[9] Additionally, an appeal for unpublished studies was made on the SSCPnet, an e-mail discussion group sponsored by the Society for a Science of Clinical Psychology, which is monitored by many clinical psychology researchers. There was only one response to the message posted to the SSCPnet, from Gregory Meyer, the editor of the present Special Section. Altogether these efforts yielded two unpublished MMPI studies and eight unpublished Rorschach studies.

Studies were selected for inclusion in the sample in a two-step procedure. First, Jordan B. Hiller screened the studies to determine whether they

[4] It should be noted that this point has been disputed by Parker, Hunsley, and Hanson (in press). But see also the reply by Garb, Florio, and Grove (in press).

Two Rorschach experts independently coded the Rorschach studies, and the MMPI studies were likewise coded by two separate MMPI judges.[10] The judges were furnished only with synopses of the methodology of each study, so that their decisions would not be contaminated by the authors' predictions or by the results. The judges were asked to indicate whether each effect size for the relationship between a Rorschach or MMPI variable and a criterion variable constituted validity evidence, that is, whether an effect could reasonably be expected to be "significant," given the nature of the test, the sample, and the criterion variable. Reliability was computed for the first set of five studies considered by both pairs of judges.
These studies contained 60 and 281 individual effect sizes for the Rorschach and MMPI, respectively. Reliability calculations were made by considering each effect size as an individual observation, disregarding the nesting of effect sizes within studies. Interrater reliability was .35 for the Rorschach judges and .39 for the MMPI judges, as indexed by the φ coefficient. Effective reliability for each pair of judges was .51 for the Rorschach and .57 for the MMPI, as calculated by the Spearman-Brown formula. Only effect sizes that both judges agreed were validity coefficients were extracted from the studies and used in the meta-analysis. Thus, studies evaluated by the judges were included in the meta-analysis as long as they contained at least one effect size that was deemed appropriate for inclusion by both judges. Studies for which judges did not agree about any effect sizes were excluded, as were studies that both judges agreed did not contain any appropriate effect sizes. Further information about the number of studies considered at each step of the selection procedure is contained in

##### Coding Procedure

For both Rorschach and MMPI studies, several variables reflecting study characteristics were coded as follows: (a) the year of publication; (b) a dichotomous variable reflecting whether the study appeared in one of five core journals that regularly publish research concerning these instruments (Assessment, Journal of Clinical Psychology, Journal of Consulting and Clinical Psychology, Journal of Personality Assessment, and Psychological Assessment) or in a different outlet; (c) a dichotomous variable indicating whether the study was included in the meta-analysis during the initial screening or later, after consideration by the judges; and (d) a code for whether the analytic method used in the study was t or F, Pearson's r, or some different method.
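The two reliability statistics reported above can be sketched concretely. The 2×2 agreement counts below are invented (the paper reports only the resulting coefficients), and the helper names are illustrative:

```python
import math

def phi(a, b, c, d):
    """phi coefficient for a 2x2 agreement table [[a, b], [c, d]]."""
    return (a * d - b * c) / math.sqrt((a + b) * (c + d) * (a + c) * (b + d))

def spearman_brown(r, k=2):
    """Effective reliability of k judges whose mean interjudge correlation is r."""
    return k * r / (1 + (k - 1) * r)

# Invented counts: both judges said "valid" 9 times, both said "invalid" 9 times,
# and they disagreed twice.
print(round(phi(9, 1, 1, 9), 2))        # 0.8
# Stepping an interjudge phi of .35 up to the effective reliability of the pair:
print(round(spearman_brown(0.35), 2))   # 0.52
```

Note that plugging in the rounded φ of .35 gives .52 rather than the .51 the paper reports for the Rorschach pair; presumably the authors applied the formula to an unrounded interjudge coefficient.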
Furthermore, a categorical variable with six levels was used to reflect the nature of the criterion variables used to validate MMPI or Rorschach measures (groups based on psychiatric diagnoses; objective outcomes such as suicide or hospitalization; ratings made by observers or judges; self-report questionnaires or scales; "projective" measures; or a combination of the preceding criterion types).[11]

Rorschach studies were categorized by the different types of Rorschach predictors used in the studies (dichotomous signs such as the presence or absence of space responses; sums or ratios reflecting the absolute number or proportion of certain types of responses in a protocol; scales or composites of multiple elements; or a combination of several types of predictors). A dichotomous variable was used to code whether the Exner Comprehensive System or some other Rorschach coding system was used. Also, we noted whether the Rorschach predictors used in each study were based on structural features of responses (such as the shape or color of the inkblot area identified in the response), content of responses (i.e., characteristics of the percept itself, such as whether it is an animal, a type of food, a household item, etc.), or both. MMPI studies were categorized with a different scheme for predictor type (basic validity and clinical scales; supplemental or research scales; 2- or 3-point codetypes; or a combination of MMPI predictor types), and a dichotomous variable reflected whether the MMPI or MMPI-2 was used.

##### Meta-Analytic Techniques

Information extracted for meta-analytic calculations included validity coefficients (i.e., effect sizes), their significance levels, and the number of sampling units used in each study.
When results were reported as "significant at p < .05" or "not significant," conservative estimates of significance and effect size were obtained by assuming one-tailed significance levels of .05 or .50, respectively. When unfocused F or χ² statistics were reported in the original studies, contrast analyses were conducted in order to extract meaningful effect sizes from focused comparisons. When more than one effect size was available within a study, the data were combined into a single estimate according to the methods described by[12]

Effect size calculation was usually straightforward, but sometimes alternative procedures were necessary. On some occasions, test scores for a single group were compared to appropriate published norms, such as the MMPI-2 standardization sample.

The meta-analytic procedures used here are those described by[14] Two statistical models are commonly distinguished in meta-analytic work: fixed-effects models and random-effects models. The more common fixed-effects model effectively uses participants from the constituent studies in a meta-analysis as sampling units, whereas random-effects analyses use entire studies as sampling units. The chief interpretive difference between these two meta-analytic models concerns the population to which results may be generalized. For fixed-effects models, results are technically generalizable only to the populations examined in the particular studies entering the meta-analysis.

[10] Robert Bornstein and Mark Hilsenroth were the judges for Rorschach studies; David Berry and Radhika Krishnamurthy evaluated the MMPI studies.

[11] In this article, we use the term "projective" in the historical sense, in reference to tests using ambiguous stimuli and eliciting open-ended responses. They are contrasted with self-report measures (historically known as "objective" instruments), which usually have a true-false or multiple-choice format.
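The conservative-imputation step described above (treating "significant at p < .05" as one-tailed p = .05 and "not significant" as p = .50) can be sketched as follows; the conversion r = Z/√N is the standard one for recovering an effect size from a significance level, and the sample size here is invented:

```python
from math import sqrt
from statistics import NormalDist

def r_from_one_tailed_p(p, n):
    """Convert an imputed one-tailed p to a standard normal Z, then to r = Z / sqrt(N)."""
    z = NormalDist().inv_cdf(1 - p)
    return z / sqrt(n)

# A result reported only as "significant at p < .05" in a study with N = 50:
print(round(r_from_one_tailed_p(0.05, 50), 2))   # 0.23
# A result reported only as "not significant" contributes an effect size of zero:
print(round(r_from_one_tailed_p(0.50, 50), 2))   # 0.0
```

Because p = .05 is the smallest significance the report guarantees, the imputed r of .23 is a floor, not an estimate of the true effect; this is what makes the procedure conservative.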
Random-effects models allow generalization to populations of relevant studies (existing or hypothetical) that were not included in the meta-analysis. The price of the greater generalizability of the random-effects model is reduced statistical power.[15] There are good reasons for using each kind of analysis, but the random-effects approach may be particularly relevant for the current meta-analysis, given that the studies used here are truly a sample from a larger population of studies to which we would like to generalize. However, random-effects analyses are underpowered for some of the comparisons between smaller subsets of the sample. In this investigation, we generally use random-effects analyses for comparisons among sets of studies, and for the major analyses, fixed-effects calculations are presented as well. We note that although some sophisticated random-effects techniques have recently been developed

#### Results

Information about characteristics of individual studies is presented in

##### Comparisons Between Published and Unpublished Studies

Because the studies examined here are from two very different sources (a randomly selected sample of published studies and a small convenience sample of unpublished studies), the first analyses were conducted to examine the differences between published and unpublished studies. Only a small number of unpublished studies were available for these analyses, so fixed-effects comparisons were conducted to maximize their power. Effect sizes from unpublished Rorschach studies (r̄ = .29) were practically identical in magnitude to those obtained from published studies (r̄ = .29; Z = .13, p = .90, two-tailed).[16] However, the single unpublished MMPI study had an effect size (r = .74) that was greater than those from published MMPI studies (r̄ = .30; Z = 5.88, p = 4 × 10⁻⁹, two-tailed).
Given that unpublished studies are generally expected to have lower effect sizes than published ones, the unpublished MMPI study we were able to obtain was probably not representative of the population of unpublished studies. We therefore omitted both Rorschach and MMPI unpublished studies from the remaining analyses, recognizing that the results reported here are generalizable only to the population of published studies.

##### Effect Sizes and Overall Significance