Twenty-eight RCTs were included in the review. Together they enrolled 2,239 children who attended because of their behaviour (sample sizes ranged from 24 to 305). Only one study was considered to be free from bias; another five were at risk of bias on only one criterion. Most studies did not fully document randomisation and allocation concealment procedures or the level of attention given to participants in the control groups. Most studies did not use intention-to-treat analyses. Only three RCTs satisfied all four of the external validity criteria and seven met three criteria; eight met two criteria, eight met one, and two met none.
There were significant reductions in child disruptive behaviour on all the outcomes assessed. The weighted mean difference for the ECBI Intensity scale was -20.90 (95% CI -26.26 to -15.53; I²=71.1%; 24 RCTs) and for the ECBI Problem scale it was -6.03 (95% CI -7.70 to -4.36; I²=71.1%; 17 RCTs). The CBCL Externalizing scale showed a weighted mean difference of -3.66 (95% CI -6.28 to -1.04; I²=26.9%; five RCTs) and the SDQ Conduct scale showed a weighted mean difference of -0.59 (95% CI -0.88 to -0.29; I²=0%; six RCTs).
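As a rough reading aid, the pooled estimates above can be checked for internal consistency by recovering the standard error implied by each 95% confidence interval and forming a z statistic. The sketch below is illustrative only: it assumes the intervals were built on a normal approximation (estimate ± 1.96 × SE), the function names `implied_se`, `z_statistic`, and `summarise` are hypothetical, and it is not a reconstruction of the review authors' own analysis. Small asymmetries in the reported intervals reflect rounding.

```python
# Pooled results copied from the text above.
# Each entry: (weighted mean difference, lower 95% limit, upper 95% limit).
pooled = {
    "ECBI Intensity": (-20.90, -26.26, -15.53),
    "ECBI Problem": (-6.03, -7.70, -4.36),
    "CBCL Externalizing": (-3.66, -6.28, -1.04),
    "SDQ Conduct": (-0.59, -0.88, -0.29),
}

# Assumption: two-sided 95% intervals based on the normal distribution.
Z_95 = 1.96


def implied_se(lower, upper):
    """Standard error implied by a symmetric 95% confidence interval."""
    return (upper - lower) / (2 * Z_95)


def z_statistic(wmd, lower, upper):
    """Pooled estimate divided by its implied standard error."""
    return wmd / implied_se(lower, upper)


def summarise(results):
    """Return {scale: (implied SE, z, significant at the 5% level)}."""
    summary = {}
    for scale, (wmd, lower, upper) in results.items():
        se = implied_se(lower, upper)
        z = z_statistic(wmd, lower, upper)
        summary[scale] = (se, z, abs(z) > Z_95)
    return summary


if __name__ == "__main__":
    for scale, (se, z, significant) in summarise(pooled).items():
        print(f"{scale}: SE={se:.2f}, z={z:.2f}, significant={significant}")
```

Under these assumptions every interval excludes zero, which matches the statement that the reductions were significant on all four outcomes. The check says nothing about heterogeneity, which the I² values report separately.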
There was no significant relationship between total practice score and the effect size on either of the ECBI subscales. Subgroup analyses found a significant relationship between a positive score on the routine service criterion and effect size but no other interactions between external validity measures and effectiveness.
Analysis of contour-enhanced funnel plots for the ECBI Intensity and Problem scales suggested the existence of publication bias.