Twenty-eight studies were included: 21 studies examined major depressive disorder (n=9,293) and 7 studies examined major depressive disorder or dysthymia (n=2,609).
Major depressive disorder.
The median LR for positive tests was 3.3 (range: 2.3 to 12.2), suggesting that a positive depression screen is more than 3 times as likely to be seen in someone with major depressive disorder as in someone without. The median LR for negative tests was 0.19 (range: 0.14 to 0.35), suggesting that a negative depression screen is about 0.2 times as likely to be seen in someone with major depressive disorder as in someone without.
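To illustrate how the likelihood ratios above translate into post-test probabilities, the following minimal sketch applies the standard odds form of Bayes' theorem (post-test odds = pre-test odds × LR). The 10% pre-test probability is a hypothetical value chosen for illustration and is not a figure from the review; the function name is likewise an assumption.

```python
def post_test_probability(pre_test_probability: float, likelihood_ratio: float) -> float:
    """Convert a pre-test probability to a post-test probability using an LR."""
    if not 0.0 < pre_test_probability < 1.0:
        raise ValueError("pre_test_probability must be strictly between 0 and 1")
    if likelihood_ratio < 0.0:
        raise ValueError("likelihood_ratio must be non-negative")
    # Probability -> odds, scale by the LR, then odds -> probability.
    pre_test_odds = pre_test_probability / (1.0 - pre_test_probability)
    post_test_odds = pre_test_odds * likelihood_ratio
    return post_test_odds / (1.0 + post_test_odds)


# Hypothetical pre-test probability (assumption, not taken from the review).
ASSUMED_PRE_TEST_PROBABILITY = 0.10

# Median LRs reported for major depressive disorder.
after_positive = post_test_probability(ASSUMED_PRE_TEST_PROBABILITY, 3.3)
after_negative = post_test_probability(ASSUMED_PRE_TEST_PROBABILITY, 0.19)

print(f"After a positive screen: {after_positive:.1%}")
print(f"After a negative screen: {after_negative:.1%}")
```

Under this assumed 10% pre-test probability, a positive screen raises the probability of major depressive disorder to roughly 27%, and a negative screen lowers it to about 2%. Different pre-test probabilities would give different results.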
Major depressive disorder or dysthymia.
The median LR across all instruments for positive results was 3.9 (range: 2.27 to 5.19). The median LR for negative tests was 0.3 (range: 0.05 to 0.53).
Heterogeneity.
Statistically significant differences in effectiveness scores between studies were shown for a number of different questionnaires (BDI, CES-D, HSCL and SDS), suggesting that the instruments performed variably across the individual studies. Performance did not differ significantly between instruments.
Reproducibility of the reference standard.
Semi-structured clinical interviews (n=7): inter-rater reliability, as measured by the kappa statistic, ranged from 0.64 to 0.93, representing good to excellent agreement.
Non-structured clinical interviews (n=7): inter-rater reliability, as measured by the kappa statistic, ranged from 0.55 to 0.74, representing fair to good agreement.
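For readers unfamiliar with the kappa statistic used above, the following sketch computes Cohen's kappa for two raters making a binary diagnosis, as kappa = (observed agreement - chance agreement) / (1 - chance agreement). The counts are hypothetical and are not data from the included studies; the review does not state which variant of kappa each study used, so Cohen's two-rater form is an assumption.

```python
def cohens_kappa(both_yes: int, a_yes_b_no: int, a_no_b_yes: int, both_no: int) -> float:
    """Cohen's kappa for two raters and a binary (yes/no) rating."""
    total = both_yes + a_yes_b_no + a_no_b_yes + both_no
    if total == 0:
        raise ValueError("at least one rated case is required")
    # Observed proportion of cases on which the raters agree.
    observed = (both_yes + both_no) / total
    # Agreement expected by chance, from each rater's marginal "yes" rate.
    a_yes = (both_yes + a_yes_b_no) / total
    b_yes = (both_yes + a_no_b_yes) / total
    expected = a_yes * b_yes + (1.0 - a_yes) * (1.0 - b_yes)
    if expected == 1.0:
        raise ValueError("kappa is undefined when chance agreement is 1")
    return (observed - expected) / (1.0 - expected)


# Hypothetical counts for 100 interviews rated by two clinicians.
kappa = cohens_kappa(both_yes=40, a_yes_b_no=5, a_no_b_yes=5, both_no=50)
print(f"kappa = {kappa:.2f}")
```

With these illustrative counts the raters agree on 90% of cases, about half of which would be expected by chance, giving a kappa of about 0.80.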