A meta-analytic review of depression prevention programs for children and adolescents: Factors that predict magnitude of intervention effects. (2009)
Venue: | Journal of Consulting and Clinical Psychology, |
Citations: | 21 - 1 self |
BibTeX
@ARTICLE{Stice09ameta-analytic,
author = {Eric Stice and Heather Shaw and Cara Bohon and C Nathan Marti and Paul Rohde},
title = {A meta-analytic review of depression prevention programs for children and adolescents: Factors that predict magnitude of intervention effects.},
journal = {Journal of Consulting and Clinical Psychology},
year = {2009},
pages = {486--503}
}
Abstract
In this meta-analytic review, the authors summarized the effects of depression prevention programs for youth as well as investigated participant, intervention, provider, and research design features associated with larger effects. They identified 47 trials that evaluated 32 prevention programs, producing 60 intervention effect sizes. The average effects for depressive symptoms from pre-to-posttreatment (r = .15) and pretreatment-to-follow-up (r = .11) were small, but 13 (41%) prevention programs produced significant reductions in depressive symptoms and 4 (13%) produced significant reductions in risk for future depressive disorder onset relative to control groups. Larger effects emerged for programs targeting high-risk individuals, samples with more females, samples with older adolescents, programs with a shorter duration and with homework assignments, and programs delivered by professional interventionists. Intervention content (e.g., a focus on problem-solving training or reducing negative cognitions) and design features (e.g., use of random assignment and structured interviews) were unrelated to effect sizes. Results suggest that depression prevention efforts produce a higher yield if they incorporate factors associated with larger intervention effects (e.g., selective programs with a shorter duration that include homework).
Keywords: depression prevention, adolescents, meta-analytic review
Major depression is one of the most common psychiatric problems faced by adolescents, is marked by a recurrent course and elevated psychiatric comorbidity, and increases risk for future suicide attempts, academic failure, interpersonal problems, unemployment, and legal problems. Although numerous trials of depression prevention programs have been conducted, their findings have not been comprehensively reviewed and analyzed with meta-analytic procedures. In a recent meta-analytic review,
Heather Shaw is now at the Oregon Research Institute.
Preparation of this article was supported by research grant MH 67183 from the National Institutes of Health. We thank Jane Gillham for her insightful and thoughtful comments on an earlier version of this article.
Correspondence concerning this article should be addressed to Eric Stice, Oregon Research Institute, 1715 Franklin Blvd., Eugene, Oregon, 97403. E-mail: estice@ori.org
Journal of Consulting and Clinical Psychology, 2009, Vol. 77, No. 3, 486-503. © 2009 American Psychological Association. 0022-006X/09/$12.00. DOI: 10.1037/a0015168
moderators of program effectiveness, and by conducting a formal evaluation of interrater agreement for abstracted information.
Putative Moderators of Intervention Effects
Examining moderators that predict magnitude of prevention program effects may identify aspects of the participants, interventions, providers, and research design associated with stronger effects. This information should increase the yield of future prevention efforts by identifying the conditions under which optimal prevention effects occur and the subgroups of individuals for whom alternative depression prevention programs need to be developed. These analyses may also advance theories regarding effective routes to reduce risk for depressive episodes and enhance the methodological rigor of trials. Thus, we investigated several potential moderators of intervention effects that were selected on the basis of theory, prior findings, and past literature reviews.
Participant Features
Participant risk status. Meta-analytic reviews have found that prevention programs often produce significantly stronger effects when interventions are offered to high-risk participants (selective and indicated prevention programs) versus all individuals in a population (universal prevention programs) for various outcomes, including depression.
Participant gender.
We hypothesized that the effects for depression prevention programs would be larger for female versus male youth on the basis of evidence that adolescent girls report greater depressive symptoms and higher rates of major depression than adolescent boys.
Participant ethnicity. We hypothesized that depression prevention programs would produce larger effects for samples containing greater proportions of ethnic minority youth, as there is evidence that ethnic minority youth report more depressive symptoms than White youth.
Participant age. We theorized that children and early adolescent youth may find it more difficult to grasp the concepts and skills taught in the interventions than older adolescents.
Intervention Features
Program content. Intervention content should influence whether a program produces effects.
Intervention duration. Meta-analyses of prevention programs for other problems revealed that longer interventions produced superior effects compared with very brief interventions.
Homework. Theoretically, prevention programs that include homework exercises relevant to the principles taught in the program should produce larger intervention effects than programs without homework. Clinicians have similarly posited that homework strengthens the impact of treatment for depression.
Provider Features: Professional Interventionists
Researchers have suggested that prevention programs are more effective when delivered by dedicated professional interventionists versus classroom teachers (Baranowski, Cullen, Nicklas,
Design Features
Random assignment.
Trials in which participants are randomly assigned to condition should produce larger intervention effects than trials in which alternative approaches are used to allocate participants to condition (e.g., matching) because random assignment is the best approach to generating groups that are equivalent on potential confounds at baseline (with sufficiently large sample sizes), which should minimize the odds that any of these confounds are correlated with treatment condition and maximize the ability to detect intervention effects. Accordingly, we hypothesized that intervention effects may be greater for trials that used random assignment relative to other allocation approaches. However, because the proper analysis of intervention effects involves tests of differential change across conditions, which adjusts for any initial differences at baseline on the outcome, we suspected that this effect might not emerge. Indeed, random assignment did not emerge as a moderator of effect sizes in meta-analytic reviews of eating disorder
Publication status. Numerous meta-analytic reviews have documented a file-drawer phenomenon
Incorrect unit of analysis. In many prevention trials, the classrooms or schools are the unit of random assignment to condition, but the data are analyzed as if the individual were the unit of randomization. This practice increases the risk for a false-positive finding because it artificially reduces the error term and increases the between-condition effect. The degrees of freedom for the test statistics are also artificially inflated, and the assumption of independent errors is violated. Therefore, we tested the hypothesis that trials in which the unit of random assignment was not equivalent to the unit of analysis would produce larger intervention effects than trials in which the unit of randomization and analyses matched.
Follow-up duration.
Effect sizes for prevention programs are typically strongest at posttest and become smaller at each subsequent follow-up assessment.
We were interested in additional moderators but were unable to include them for various reasons. We wanted to test whether effect sizes would be larger for programs that involved more extensive interventionist training and programs with higher session attendance and smaller for programs evaluated using blinded assessors, but reports did not contain sufficient detail for coding. Other moderators were not coded because they did not have sufficient variability, including whether (a) the intervention modality was individual or group (all were group), (b) the intervention had psychoeducational content (almost all included this content), (c) booster sessions were used (almost none used such sessions), (d) an intervention was interactive or didactic (almost all were interactive), and (e) the study outcome was assessed with validated measures (all included validated measures).
Method
Sample of Studies
Five procedures were used to retrieve published and unpublished trials of depression prevention programs. First, a computer search was performed on the PsycINFO, MEDLINE, and Dissertation Abstracts databases for the years 1980-2008 with the following keywords: depression, depressive, prevention, preventive, and intervention. Two research assistants and a librarian performed independent searches. Eric Stice reviewed the products of all three searches to identify pertinent articles. Second, the tables of contents for journals that commonly publish articles in this area were reviewed for this same period (e.g., Journal of Consulting and Clinical Psychology). Third, we consulted narrative reviews and prior meta-analytic reviews of the depression prevention field to search for additional citations. Fourth, the reference sections of all identified articles were examined.
Finally, established depression prevention researchers were asked for copies of unpublished articles (under review or in press) describing prevention trials.
Inclusion and Exclusion Criteria
We focused exclusively on studies that included a continuous measure of depressive symptoms or conducted interviews assessing criteria for major depression. We also focused exclusively on trials that were conceptualized as depression prevention programs and did not include trials in which depressive symptoms were treated as a secondary outcome. If multiple reports of the same trial were published, we recorded effect sizes from all available follow-ups. We focused on effect sizes testing for differential change in depressive symptoms because only nine trials tested whether the prevention program reduced the risk for onset of depressive disorder among intervention participants relative to control participants. We included trials in which participants were randomly assigned to a depression prevention program or to an attention control condition, an assessment-only control condition, or a waitlist control condition. We also included trials in which some other relevant comparison group was used (e.g., matched controls) in a quasi-experimental design. We focused exclusively on studies that tested whether the change in the outcomes over time was significantly greater in the intervention group versus the control group. This could take the form of a Time × Condition interaction in a repeated-measures analysis of variance (ANOVA) model, an analysis of covariance (ANCOVA) model that controlled for initial levels of the outcome variable, or a growth curve model that controlled for initial levels of the outcome.
We also included trials that used logistic regression or survival models to test whether the incidence of major depression onset was significantly lower in the intervention condition versus a control condition, provided initially depressed participants were excluded from the analyses.
We restricted our focus to trials that targeted children and adolescents because of our interest in determining whether effective interventions have been designed for this developmental period. We believe that depression prevention programs should be implemented before most individuals are expected to show onset of their first major depression episode. We used a broad view of adolescence and included trials with a mean age of participants up to age 22 because this captured college-based depression prevention programs. Many developmental psychologists consider adolescence to span from approximately age 12 through age 24 (Arnett, 2000).
Effect Size Estimation Procedures
We calculated effect sizes for tests of differential change in depressive symptoms across the intervention and control conditions. However, if only the effect size for differential risk for onset of major depression across the conditions was available, that was used as the effect size. The correlation coefficient (r) was used as the index of effect size because of its similar interpretation across different combinations of interval, ordinal, and nominal variables (Pearson's r, Spearman's rho, and point biserial; Rosenthal, 1991) and because this effect size preserved the valence of the effects. Cohen's (1988) criteria for small (r = .10), medium (r = .30), and large (r = .50) effects were used. If effect sizes were reported in Cohen's (1988) d, we converted them to r with the formula provided on Page 20 of
Operationalization and Coding of Effect Size Moderators
An iterative approach was taken to ensure reliable abstraction of moderators from the reports.
First, Heather Shaw and Cara Bohon generated a coding system for the moderators on an a priori basis. Second, they coded a sample of 10 studies and then discussed and resolved all discrepancies, refining the coding system as necessary. Third, the remaining studies were coded independently and reliability coefficients were calculated. Finally, Heather Shaw and Cara Bohon held consensus meetings to resolve any remaining disagreements with regard to the coding of moderators. This final corrected data set was used for all analyses.
Results
Descriptive Statistics
The literature search identified 46 trials that met the inclusion criteria, in which 32 different depression prevention programs were evaluated (11 trials evaluated more than 1 program, and 9 programs were evaluated in 2-8 trials), resulting in a total of 60 effect sizes. We calculated interrater agreement between the two moderator coders for all trials included in this review (see
Average Effect Size and Effect Size Heterogeneity
A Statistical Analysis System (SAS Institute, Cary, NC) macro that computed inverse variance-weighted average effect sizes for
The r values for posttest effect sizes ranged from −.47 to .68. There was significant heterogeneity in effect sizes at posttest (Q = 528.76, p < .001), indicating variability across effect sizes. The average follow-up effect size across all studies (M r = .11) was significantly larger than zero (z = 6.40, p < .001). The r values for follow-up effect sizes ranged from −.18 to .76. There was also significant heterogeneity in effect sizes at follow-up (Q = 145.69, p < .001).
Relations of Moderators to Observed Effect Sizes
Moderator analyses were conducted using inverse variance-weighted random-effects regression models. Random-effects models separate the overall variability in observed effect sizes from the within-intervention variance.
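
As a rough illustration of the effect size steps described above (converting d to r, labeling magnitudes with Cohen's cutoffs, and computing an inverse-variance-weighted average with a Q heterogeneity statistic), a small Python sketch may help. It is not the SAS macro used in the review. It assumes the common equal-group-size conversion r = d / sqrt(d^2 + 4), a Fisher z transform with weights n − 3, and a fixed-effect average; all function names and the example inputs are hypothetical.

```python
import math


def d_to_r(d):
    """Convert Cohen's d to r, assuming equal group sizes (an assumption,
    not necessarily the formula the review cites)."""
    return d / math.sqrt(d * d + 4.0)


def cohen_label(r):
    """Label |r| with the cutoffs quoted in the text (.10, .30, .50)."""
    size = abs(r)
    if size >= 0.50:
        return "large"
    if size >= 0.30:
        return "medium"
    if size >= 0.10:
        return "small"
    return "trivial"


def weighted_mean_r(rs, ns):
    """Fixed-effect, inverse-variance-weighted mean of correlations.

    Each r is transformed to Fisher's z, whose sampling variance is
    approximated by 1 / (n - 3), so the weight is n - 3.
    Returns the back-transformed mean r, a z test of the mean against
    zero, and the Q heterogeneity statistic with its degrees of freedom.
    """
    zs = [math.atanh(r) for r in rs]
    ws = [n - 3 for n in ns]
    total_w = sum(ws)
    z_bar = sum(w * z for w, z in zip(ws, zs)) / total_w
    se = math.sqrt(1.0 / total_w)
    q = sum(w * (z - z_bar) ** 2 for w, z in zip(ws, zs))
    return {
        "mean_r": math.tanh(z_bar),
        "z_test": z_bar / se,
        "Q": q,
        "df": len(rs) - 1,
    }


# Hypothetical inputs for illustration only (not data from the review).
example_rs = [0.05, 0.22, d_to_r(0.30)]
example_ns = [103, 203, 153]
summary = weighted_mean_r(example_rs, example_ns)
print(cohen_label(summary["mean_r"]), summary)
```

Q is compared against a chi-square distribution with the returned degrees of freedom; a large Q relative to df points to heterogeneity of the kind reported above.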
If studies are treated as a source of random variability, random-effects models can be generalized to a broader set of studies or potential studies. Regression models with maximum likelihood estimation were conducted using a SAS macro written for meta-analysis.
Moderators were examined individually in regression models to investigate the univariate relations between moderators and effect sizes. Although some meta-analyses have used multivariate approaches that test whether each moderator shows a unique relation to effect sizes statistically controlling for the other moderators (Perepletchikov, Treat, & Kazdin, 2007;
The four continuous moderators (percentage of females, percentage of Whites, average age, and intervention duration) were standardized in a z score format. We tested for linear and quadratic effects for the continuous moderators to decrease the risk of model misspecification.
Results for all univariate models are presented in 1 Risk status of participants was also a significant predictor of effect sizes from follow-up assessments: selective trials exhibited a moderate average effect size (M r = .14, p < .001, n = 28), but universally implemented programs exhibited a small average effect size (M r = .06, p < .001, n = 21), though both effects differed significantly from zero.
The percentage of the participants who were female in the trials was significantly related to effect sizes. 2 At posttest, interventions below the median (≥ 53% females) exhibited a small nonsignificant average effect size (M r = .05, p = ns, n = 26), whereas the average effect for interventions at or above the median was moderate and significant (M r = .22, p < .001, n = 32). A similar effect was observed with effect sizes from follow-ups: interventions below the median exhibited a small average effect size that was significant (M r = .09, p < .001, n = 21) and interventions at or above the median showed larger effects (M r = .12, p < .001, n = 27).
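
The univariate moderator models described above (a z-scored continuous moderator entered with linear and quadratic terms in an inverse-variance-weighted regression) can be sketched roughly as follows. This is a simplified weighted least squares fit in plain Python, not the maximum likelihood random-effects SAS macro the authors used. The function names are hypothetical, and the caller is assumed to supply the weights, for example 1 / (v + tau^2) if a between-study variance estimate is available from elsewhere.

```python
import math


def zscore(xs):
    """Standardize values to mean 0 and sample SD 1."""
    mean = sum(xs) / len(xs)
    sd = math.sqrt(sum((x - mean) ** 2 for x in xs) / (len(xs) - 1))
    return [(x - mean) / sd for x in xs]


def solve(a, b):
    """Solve the linear system a x = b by Gaussian elimination
    with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for i in range(n - 1, -1, -1):
        tail = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - tail) / m[i][i]
    return x


def weighted_quadratic_fit(effects, weights, moderator):
    """Weighted least squares of effect sizes on a z-scored moderator.

    Returns [intercept, linear, quadratic] coefficients by solving the
    weighted normal equations (X'WX) b = X'Wy.
    """
    z = zscore(moderator)
    design = [[1.0, zi, zi * zi] for zi in z]
    xtwx = [
        [sum(w * row[i] * row[j] for w, row in zip(weights, design))
         for j in range(3)]
        for i in range(3)
    ]
    xtwy = [
        sum(w * row[i] * y for w, row, y in zip(weights, design, effects))
        for i in range(3)
    ]
    return solve(xtwx, xtwy)


# Hypothetical inputs for illustration only (not data from the review).
pct_female = [40.0, 50.0, 60.0, 70.0, 80.0]
effect_sizes = [0.02, 0.08, 0.15, 0.20, 0.21]
inv_var_weights = [100.0, 150.0, 120.0, 90.0, 60.0]
print(weighted_quadratic_fit(effect_sizes, inv_var_weights, pct_female))
```

A nonzero quadratic coefficient is the kind of pattern that would then be probed with median or tertile splits.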
Percentage of White participants exhibited a quadratic effect at posttest. Probing this pattern with tertile splits revealed that effects were similar for the lowest tertile, which was less than 55% Whites (M r = .24, p < .001, n = 11), and the middle tertile, which was between 55% and 83% Whites (M r = .25, p < .001, n = 13), but effect sizes were trivial and nonsignificant for interventions containing greater than 83% White participants (M r = .04, p = ns, n = 11). Participant age was a significant predictor of effect size at posttest;
1 We also compared selective versus indicated programs to ensure that it was reasonable to combine these two types of programs. There were no differences between selective and indicated programs at posttest (z = −.69, p = .49) or at follow-up (z = 1.60, p = .11).
2