There were 106 studies (172 comparisons) included in the review. The mean number of participants per study was 63.4 (range 10 to 374), which implies a total of approximately 6,720 participants across the included studies.
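The total above is not reported directly; it follows from the number of studies and the mean sample size. A minimal sketch of that arithmetic, assuming the mean of 63.4 is taken over all 106 studies (the variable names are illustrative only):

```python
# Figures taken from the text above.
n_studies = 106
mean_participants = 63.4

# Total implied by the mean: number of studies times mean sample size.
implied_total = n_studies * mean_participants

print(round(implied_total))  # 6720
```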
Effect size estimates from different types of comparisons: Unless otherwise stated, studies within comparison groups were homogeneous. For studies comparing combination treatment to somatic (or standard) treatment alone (n=71), the weighted least squares (WLS) average of the effect sizes was d = 0.39 (95% CI: 0.32, 0.44). Effect sizes were heterogeneous (QW = 172.28, p<0.001).
For psychosocial treatment plus somatic (or standard) treatment versus no treatment (n=5), d = 0.85 (95% CI: 0.62, 1.09). Effect sizes were heterogeneous (QW = 12.28, p = 0.015).
For psychosocial treatment plus somatic (or standard) treatment versus psychosocial treatment alone (n=5), d = 0.27 (95% CI: 0.03, 0.51).
For psychosocial treatment versus no treatment (n=6), d = 0.37 (95% CI: 0.19, 0.55).
For psychosocial treatment alone versus somatic treatment alone (n=3), d = -0.06 (95% CI: -0.32, 0.21). Effect sizes were heterogeneous (QW = 12.93, p = 0.002).
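The WLS averages and QW statistics reported for these comparisons can be illustrated with a short sketch. It assumes the conventional fixed-effect approach: inverse-variance weights, a normal-approximation 95% CI, and QW referred to a chi-square distribution with n - 1 degrees of freedom. The review does not state its exact formulas, so these are assumptions, and the function names and the three-study input are hypothetical, not data from the review.

```python
import math


def wls_mean_and_q(effects, variances):
    """Inverse-variance weighted mean, its 95% CI, and the homogeneity statistic Q."""
    weights = [1.0 / v for v in variances]
    total_w = sum(weights)
    mean = sum(w * d for w, d in zip(weights, effects)) / total_w
    se = math.sqrt(1.0 / total_w)
    q = sum(w * (d - mean) ** 2 for w, d in zip(weights, effects))
    return mean, (mean - 1.96 * se, mean + 1.96 * se), q


def chi2_sf_even_df(x, df):
    """Upper-tail chi-square probability; exact closed form, valid only for even df."""
    if df <= 0 or df % 2:
        raise ValueError("closed form requires a positive even df")
    half = x / 2.0
    term, total = 1.0, 1.0
    for i in range(1, df // 2):
        term *= half / i
        total += term
    return math.exp(-half) * total


# Hypothetical effect sizes and sampling variances, for illustration only.
mean_d, ci, q_w = wls_mean_and_q([0.2, 0.5, 0.8], [0.04, 0.05, 0.10])
print(f"d = {mean_d:.2f}, 95% CI ({ci[0]:.2f}, {ci[1]:.2f}), QW = {q_w:.2f}")

# Reported QW values, assuming df = n - 1 comparisons.
print(round(chi2_sf_even_df(12.28, 4), 3))  # n=5 -> 0.015
print(round(chi2_sf_even_df(12.93, 2), 3))  # n=3 -> 0.002
```

Under the df = n - 1 assumption, the two p-values reproduce those reported for the n=5 and n=3 comparisons.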
Effect of psychosocial treatment on relapse (14 studies): The relapse frequencies for patients who received psychosocial treatment in addition to somatic (or standard) treatment were consistently lower than for patients who received only somatic (or standard) treatment. The relapse frequencies for the psychosocial treatment groups were, on average and after weighting for sample size, 20% lower than those for the control groups.
Moderator variables: More recent studies tended to produce larger effect sizes; this association was statistically significant before, but not after, adjustment for multiple testing. Studies with larger sample sizes produced smaller effect sizes (r = -0.38, df = 68, p = 0.0013), an association that was statistically significant before and after adjustment for multiple testing. Random assignment, manualization, equal attrition rates and use of structured interview in diagnosis did not have any reliable effects on the outcome of comparisons. The effect of patient expectation could not be examined, because none of the studies reported this variable.
Effect sizes from those studies in which the authors had a clear allegiance to the experimental treatment were larger than those in which the allegiance was not clear. This effect was statistically significant before adjustment for multiple testing, but not after. The impact of source (e.g. self-rated vs other-rated vs objective measures) and context (e.g. negative symptoms, behavioural disorganisation) could not be assessed. Time since onset of illness was the only patient variable that had a statistically significant effect before and after adjustment for multiple testing (r = 0.63, df = 30, p<0.001). Other factors such as patients' gender, age, marital status, education, IQ score, alcohol/drug abuse, and even previous hospitalisation did not have any reliable effect.
Studies that had used formal criteria in diagnosis of patients produced larger effect sizes (QB = 18.15, df = 1, p<0.001) before and after adjustment for multiple testing. Classifying the diagnostic criteria as reflecting either a narrow or broad definition of schizophrenia had no reliable effects. Similarly, the distinction between paranoid and nonparanoid subtypes was unrelated to treatment outcome. Studies from non-Western countries (six from China and two from Israel) tended to produce higher effect sizes, while studies from Scandinavian countries and the United States and Canada tended to produce lower effect sizes compared with studies from Great Britain and Continental Europe (QB = 51.40, df = 4, p<0.001). This finding was significant before and after adjustment for multiple testing. Duration of treatment had a statistically significant impact on the results (r = 0.48, df = 41, p = 0.0013), before and after adjustment for multiple testing. However, this effect was no longer present after the removal of an outlier.
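The moderator correlations above are reported with their degrees of freedom. One way to relate r and df to a significance level is the standard t test for a correlation coefficient, t = r * sqrt(df / (1 - r^2)). The sketch below assumes that this test applies; the review does not say which test was used, and the function name is hypothetical. Only the t statistic is computed; the two-sided p-value would then be read from a t distribution with the stated df.

```python
import math


def t_from_r(r, df):
    """t statistic for a correlation coefficient r with df degrees of freedom."""
    return r * math.sqrt(df / (1.0 - r * r))


# (label, r, df) as reported in the moderator analyses.
reported = [
    ("sample size", -0.38, 68),
    ("time since onset of illness", 0.63, 30),
    ("duration of treatment", 0.48, 41),
]

for label, r, df in reported:
    print(f"{label}: r = {r:+.2f}, df = {df}, t = {t_from_r(r, df):+.2f}")
```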
Impact of modality: There was a statistically significant difference among effect sizes for six basic modalities (individual, group, family, milieu, occupational/recreational, and community care) (QB = 11.7, df = 5, p<0.05). Studies reporting on the effects of group therapy produced the smallest effect sizes. When these studies were removed from the sample, there were no differences between estimates from the other five modalities.
Impact of orientation: There were no statistically significant differences between effect size estimates from three broad theoretical orientations: behavioural, "verbal" therapies, and cognitive training.
Posttreatment versus follow-up (10 studies): Effect sizes were d = 0.38 (95% CI: 0.32, 0.44) for posttreatment and d = 0.42 (95% CI: 0.24, 0.59) for follow-up.
Multiple regression analysis: Two conclusions were drawn from the regression analyses. First, a large amount of the variation in the effect sizes from this heterogeneous group of studies was explainable by a small number of the variables chosen. Second, studies from Scandinavian countries and studies using measures of disorganised behaviour and possibly measures of employment tend to produce smaller effect sizes. Studies from non-Western countries, studies with more chronic patients, and possibly studies using objective diagnostic criteria tend to produce larger effect sizes.