There were 10 studies investigating single reading plus computer aids. Sample sizes ranged from 5,016 to 116,086. Five studies used a matched design, four used an unmatched design, and one used both types of data.
There were 17 matched studies investigating double reading. Sample sizes ranged from 5,659 to 257,212. There were eight studies using arbitration/consensus, six studies using unilateral decisions and three studies using a mix of these.
Single reading plus computer aids compared with single reading: There was no statistically significant difference in cancer detection rates between the two methods, based on all 10 studies (OR 1.04, 95% CI 0.96 to 1.13), or when the studies were subgrouped by design. There was a statistically significant increase in the recall rate when single reading plus computer aids was used compared with single reading alone, based on the overall pooling (OR 1.10, 95% CI 1.09 to 1.12) and when studies were subgrouped by design. There was evidence of heterogeneity in recall rate for the overall pooling and for the unmatched studies, but not for the matched studies, which provided a similar estimate to the overall pooling. The results were similar when a single study using different methods was excluded.
Double reading compared with single reading: There was a statistically significant (p<0.001) improvement in cancer detection rates with double reading (OR 1.10, 95% CI 1.06 to 1.14), based on all 17 studies. The authors stated that the subgroup analyses were 'mostly non significant'. The confidence intervals suggested a potential statistically significant improvement with both arbitration/consensus (OR 1.08, 95% CI 1.02 to 1.15) and unilateral (OR 1.13, 95% CI 1.06 to 1.19) double reading. There was a statistically significant increase in recall rate in the overall pooling (OR 1.17, 95% CI 1.15 to 1.18) and with unilateral and mixed decision methods, but not with arbitration/consensus, where there was a decreased recall rate with double reading. There was strong evidence of statistical heterogeneity in the overall pooling, as well as between and within each of the subgroups. This heterogeneity remained when a single arbitration/consensus study with a different method was excluded. A random-effects model was used for the arbitration/consensus subgroup and found a reduced recall rate with arbitration, but this was not statistically significant (OR 0.87, 95% CI 0.75 to 1.02).
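
The link between the reported confidence intervals and statistical significance can be illustrated with a minimal sketch. It assumes that each interval is a symmetric Wald interval on the log odds ratio scale, which the source does not state, and it works from the rounded published values, so the resulting z statistics and p-values are approximations only. The function names are hypothetical and are not taken from the review.

```python
import math

Z_95 = 1.959964  # standard normal quantile for a two-sided 95% interval


def approx_z_and_p(or_est, ci_low, ci_high):
    """Approximate z statistic and two-sided p-value from an OR and its 95% CI.

    Assumption: the interval is symmetric on the log scale (Wald interval).
    """
    log_or = math.log(or_est)
    # Back-calculate the standard error of log(OR) from the interval width.
    se = (math.log(ci_high) - math.log(ci_low)) / (2 * Z_95)
    z = log_or / se
    # Two-sided p-value under the standard normal distribution.
    p = math.erfc(abs(z) / math.sqrt(2))
    return z, p


def excludes_no_effect(ci_low, ci_high):
    """True when the 95% CI does not contain OR = 1 (significant at the 5% level)."""
    return ci_low > 1.0 or ci_high < 1.0


# Estimates as reported in the results above.
reported = {
    "double reading, cancer detection": (1.10, 1.06, 1.14),
    "arbitration/consensus, cancer detection": (1.08, 1.02, 1.15),
    "arbitration/consensus, recall (random effects)": (0.87, 0.75, 1.02),
}

for label, (or_est, low, high) in reported.items():
    z, p = approx_z_and_p(or_est, low, high)
    print(f"{label}: z = {z:.2f}, p = {p:.3g}, "
          f"CI excludes 1: {excludes_no_effect(low, high)}")
```

Under these assumptions, an interval that excludes 1 corresponds to a two-sided p-value below 0.05, which is the reasoning behind reading the arbitration/consensus and unilateral intervals for cancer detection as potentially significant and the random-effects recall estimate as not significant.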