Thirty-six studies (72,315 women) were included. Twenty-nine of these studies fulfilled the criteria for inclusion in the meta-analysis (number of women unclear).
The studies were generally of low methodological quality. The median number of QUADAS items fulfilled was 5 out of 12 (range: 2 to 9). Only 6 studies included a clinically representative sample of women (unselected women recruited from the community or primary care). Only 4 studies collected data prospectively to evaluate the OST. None of the studies provided explicit information about blinding. All studies reported that at least 90% of women received verification by DXA.
Femoral neck T-score ≤ -2.5 in white women (7 studies, 43,031 women): the sensitivity ranged from 88 to 96% and the specificity from 30 to 71%. The pooled negative likelihood ratio (LR-) was 0.19 (95% confidence interval, CI: 0.17 to 0.21), suggesting reasonable performance in ruling out low BMD. There was low heterogeneity (I-squared 7%).
Femoral neck T-score ≤ -2.5 in Asian women (12 studies, 8,366 women): the sensitivity ranged from 70 to 99% and the specificity from 29 to 73%. The pooled LR- was 0.19 (95% CI: 0.14 to 0.28), suggesting reasonable performance in ruling out low BMD. There was substantial heterogeneity (I-squared 64%).
Lumbar spine T-score ≤ -2.5 in white women (5 studies, 15,032 women): the sensitivity ranged from 51 to 97% and the specificity from 34 to 72%. The pooled LR- was 0.43 (95% CI: 0.31 to 0.59), suggesting poor performance in ruling out low BMD. There was substantial heterogeneity (I-squared 87%).
Lumbar spine T-score ≤ -2.5 in Asian women (5 studies, 2,744 women): the sensitivity ranged from 78 to 81% and the specificity from 56 to 75%. The pooled LR- was 0.32 (95% CI: 0.28 to 0.38), suggesting poor performance in ruling out low BMD. There was negligible heterogeneity (I-squared 0%).
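To make the rule-out interpretation of the pooled LR- values more concrete, the following is a minimal sketch of the standard definitions: LR- = (1 - sensitivity) / specificity, and post-test odds = pre-test odds × LR. The function names and the 20% pre-test probability are assumptions chosen for illustration and are not taken from the review. Only the four pooled LR- values come from the results above.

```python
def negative_lr(sensitivity: float, specificity: float) -> float:
    """Negative likelihood ratio: (1 - sensitivity) / specificity."""
    if not (0.0 <= sensitivity <= 1.0 and 0.0 < specificity <= 1.0):
        raise ValueError("sensitivity must be in [0, 1] and specificity in (0, 1]")
    return (1.0 - sensitivity) / specificity


def post_test_probability(pre_test_prob: float, lr: float) -> float:
    """Convert a pre-test probability to a post-test probability via odds."""
    if not 0.0 < pre_test_prob < 1.0:
        raise ValueError("pre_test_prob must be strictly between 0 and 1")
    pre_odds = pre_test_prob / (1.0 - pre_test_prob)
    post_odds = pre_odds * lr
    return post_odds / (1.0 + post_odds)


# Hypothetical pre-test probability of low BMD (an assumption, not from the review).
ASSUMED_PRE_TEST_PROB = 0.20

# Pooled LR- values as reported in the results above.
POOLED_LR_NEG = {
    "femoral neck, white women": 0.19,
    "femoral neck, Asian women": 0.19,
    "lumbar spine, white women": 0.43,
    "lumbar spine, Asian women": 0.32,
}

for label, lr in POOLED_LR_NEG.items():
    prob = post_test_probability(ASSUMED_PRE_TEST_PROB, lr)
    print(f"{label}: LR- {lr:.2f} -> probability after a negative OST {prob:.1%}")
```

Under this assumed pre-test probability, a smaller LR- leaves a lower residual probability of low BMD after a negative OST result, which is why values near 0.19 are read as more useful for ruling out than values of 0.32 or 0.43.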
The accuracy of the OST for ruling out a T-score of less than -2.0 was investigated only in white women and was poor regardless of anatomic region (pooled LR- ranged from 0.28 to 0.48), with high levels of between-study heterogeneity (I-squared ≥ 68%).
Heterogeneity was investigated in evaluations for femoral neck T-score of less than -2.5. For studies in white women, only the QUADAS item relating to time between the OST and DXA was associated with LR- (p=0.03). None of the items investigated showed an association with LR- in studies in Asian women.
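For readers less familiar with the I-squared statistic quoted throughout these results, it is conventionally derived from Cochran's Q as I-squared = max(0, (Q - df) / Q) × 100%, where df is the number of studies minus one. The sketch below illustrates that formula only. The function name and the Q values are hypothetical and are not reported in the review.

```python
def i_squared(q: float, df: int) -> float:
    """Percentage of between-study variability beyond chance, from Cochran's Q."""
    if df < 1:
        raise ValueError("df must be at least 1 (two or more studies)")
    if q < 0:
        raise ValueError("Cochran's Q cannot be negative")
    if q == 0:
        return 0.0
    # Negative values are truncated to zero by convention.
    return max(0.0, (q - df) / q) * 100.0


# Hypothetical example: 7 studies (df = 6) with an illustrative Q of 12.0.
print(f"I-squared = {i_squared(12.0, 6):.0f}%")
```

When Q is no larger than its degrees of freedom, the statistic is truncated to 0%, which corresponds to the "negligible heterogeneity" wording used above.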
There was no evidence of asymmetry in the funnel plots, suggesting an absence of small study effects.
For femoral neck, lumbar spine and any-region T-score ≤ -2.5 in white women, the SROC curve for all studies, regardless of the cut-off used, was very similar to the SROC curve generated for studies using a cut-off of >1.