Twenty-five studies (n=3,027, range 26 to 791) were included. All studies provided details on participant recruitment (nine enrolled patients consecutively). Eighteen studies were blinded and used independent assessment of the reference and index tests. Seven studies included 100 or more patients. Attrition rates were not reported.
Global Attentiveness Rating (GAR) (one study), Memorial Delirium Assessment Scale (MDAS) (three studies), Confusion Assessment Method (CAM) (10 studies), Delirium Rating Scale Revised-98 (DRS-R-98) (two studies), Clinical Assessment of Confusion (CAC) (one study) and Delirium Observation Screening Scale (DOSS) (two studies) scales all had positive likelihood ratios greater than 5.0, which suggested a greater likelihood of disease. The corresponding negative likelihood ratios for GAR, MDAS, CAM, DRS-R-98, and DOSS were less than 0.2, which suggested a lesser likelihood of disease. Delirium Rating Scale (DRS) (four studies), Mini-Mental State Examination (MMSE) (one study), Nursing Delirium Screening Scale (NDSS) (one study) and Vigilance "A" Test (one study) showed negative likelihood ratios less than 0.2. The CAC study showed a negative likelihood ratio above 0.2 (LR 0.67, 95% CI 0.56 to 0.81).
MMSE was identified as the least useful for identification of patients with delirium (one study). Subgroup analyses showed that the CAM was the most useful scale that could be completed in five minutes or less by a nurse (positive LR 7.3, 95% CI 1.9 to 27 and negative LR 0.08, 95% CI 0.03 to 0.21) or by a physician (positive LR 19, 95% CI 6.7 to 51 and negative LR 0.19, 95% CI 0.13 to 0.27).
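To illustrate what likelihood ratios of this size mean in practice, the sketch below converts a pre-test probability of delirium into a post-test probability using the standard odds form of Bayes' theorem. The likelihood ratios are the physician-administered CAM point estimates reported above. The 20% pre-test probability and all function and variable names are assumptions chosen for illustration; they do not come from the review.

```python
def post_test_probability(pre_test_prob: float, likelihood_ratio: float) -> float:
    """Convert a pre-test probability to a post-test probability via odds."""
    if not 0.0 < pre_test_prob < 1.0:
        raise ValueError("pre_test_prob must be strictly between 0 and 1")
    if likelihood_ratio < 0.0:
        raise ValueError("likelihood_ratio must be non-negative")
    pre_test_odds = pre_test_prob / (1.0 - pre_test_prob)
    post_test_odds = pre_test_odds * likelihood_ratio
    return post_test_odds / (1.0 + post_test_odds)


# ASSUMPTION: hypothetical pre-test probability, not reported in the review.
ASSUMED_PRE_TEST_PROB = 0.20

# Point estimates for physician-administered CAM, as reported in the text.
CAM_PHYSICIAN_POSITIVE_LR = 19.0
CAM_PHYSICIAN_NEGATIVE_LR = 0.19

prob_after_positive = post_test_probability(ASSUMED_PRE_TEST_PROB, CAM_PHYSICIAN_POSITIVE_LR)
prob_after_negative = post_test_probability(ASSUMED_PRE_TEST_PROB, CAM_PHYSICIAN_NEGATIVE_LR)

print(f"Post-test probability after a positive CAM: {prob_after_positive:.3f}")
print(f"Post-test probability after a negative CAM: {prob_after_negative:.3f}")
```

Under the assumed 20% pre-test probability, a positive CAM raises the probability of delirium to roughly 83%, and a negative CAM lowers it to roughly 4.5%. These figures use point estimates only and ignore the reported confidence intervals.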
There was evidence of statistical heterogeneity for CAM (I²=65% for the positive LR and 85% for the negative LR), DOSS (I²=65% and 0%), DRS-R-98 (I²=73% and 0%) and MDAS (I²=85% and 69%). Analyses restricted to studies in which the index test was performed by a physician resolved statistical heterogeneity for the negative likelihood ratio using CAM, but heterogeneity for the positive likelihood ratio remained statistically significant. The authors reported that subanalysis by language or DSM version resolved statistical heterogeneity, but no further data were provided.
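For readers unfamiliar with the I² statistic, the sketch below shows its usual definition in terms of Cochran's Q and the degrees of freedom (number of studies minus one). The review does not state how I² was calculated, so this is an assumption about the conventional formula. The Q value and study count in the example are hypothetical and were chosen only to yield a round figure.

```python
def i_squared(cochran_q: float, degrees_of_freedom: int) -> float:
    """Return I^2 as a percentage: max(0, (Q - df) / Q) * 100."""
    if degrees_of_freedom <= 0:
        raise ValueError("degrees_of_freedom must be positive")
    if cochran_q <= 0.0:
        return 0.0
    return max(0.0, (cochran_q - degrees_of_freedom) / cochran_q) * 100.0


# HYPOTHETICAL inputs for illustration; not taken from the review.
hypothetical_q = 20.0
hypothetical_study_count = 8
hypothetical_i2 = i_squared(hypothetical_q, hypothetical_study_count - 1)

print(f"I^2 = {hypothetical_i2:.0f}%")
```

An I² of 0% arises whenever Q does not exceed its degrees of freedom, which means the observed variation between studies is no greater than chance alone would produce.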