Thirty studies that assessed 9,784 lesions were included in the review. Numbers of participants were not reported for most studies. Eighteen studies assessed dermoscopy, seven studies assessed digital dermoscopy/artificial intelligence, and five studies evaluated both. The most common algorithm used in dermoscopy studies was pattern analysis (10 studies).
Studies included in the review examined 8,045 lesions assessed using dermoscopy and 2,420 lesions assessed using artificial intelligence.
The pooled estimate of sensitivity for dermoscopy (any algorithm) was 0.88 (95% CI 0.87 to 0.89) and the pooled estimate of specificity was 0.86 (95% CI 0.85 to 0.86) from 23 studies (30 data sets).
The pooled estimate of sensitivity for artificial intelligence was 0.91 (95% CI 0.88 to 0.93) and the pooled estimate of specificity was 0.79 (95% CI 0.77 to 0.81) from 12 studies.
Sensitivity and specificity were calculated per lesion.
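The per-lesion sensitivity and specificity reported above follow the standard definitions from a 2x2 classification table. The following sketch illustrates those definitions; the counts used are hypothetical and are not taken from the review, they are simply chosen to reproduce the pooled dermoscopy point estimates.

```python
# Illustrative only: per-lesion sensitivity and specificity from a 2x2 table.
# The counts below are hypothetical, not data from the review.

def sensitivity(tp: int, fn: int) -> float:
    """Proportion of melanomas correctly identified: TP / (TP + FN)."""
    return tp / (tp + fn)

def specificity(tn: int, fp: int) -> float:
    """Proportion of benign lesions correctly identified: TN / (TN + FP)."""
    return tn / (tn + fp)

# Hypothetical counts for 1,000 lesions (200 melanomas, 800 benign)
tp, fn, tn, fp = 176, 24, 688, 112
print(f"sensitivity = {sensitivity(tp, fn):.2f}")  # 176/200 = 0.88
print(f"specificity = {specificity(tn, fp):.2f}")  # 688/800 = 0.86
```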
Pooled sensitivity for artificial intelligence was slightly, but not significantly, higher than for dermoscopy (p=0.076), while pooled specificity for dermoscopy was significantly higher than for artificial intelligence (p<0.001). Among individual dermoscopy algorithms, pattern analysis had significantly lower sensitivity than seven features for melanoma (7FFM), the Menzies score and artificial intelligence, but significantly higher specificity than ABCD, ABCDE, the seven-point checklist and artificial intelligence. ABCD had significantly lower specificity than 7FFM and the three-point checklist. Artificial intelligence had significantly lower specificity than 7FFM.
The pooled estimates of the diagnostic odds ratio were 51.52 (95% CI 38.02 to 69.82) for dermoscopy and 57.83 (95% CI 26.95 to 124.08) for artificial intelligence (no significant difference). There were no significant differences in diagnostic odds ratio between diagnostic algorithms.
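For readers unfamiliar with the diagnostic odds ratio (DOR), it is the odds of a positive test result in diseased lesions divided by the odds of a positive result in non-diseased lesions, and can be written in terms of sensitivity and specificity. The sketch below applies that textbook formula to the pooled point estimates quoted earlier. Note that the resulting values differ from the review's pooled DORs (51.52 and 57.83), because the meta-analysis pooled DORs across studies directly rather than deriving them from the pooled sensitivity and specificity.

```python
# Illustrative only: diagnostic odds ratio from sensitivity and specificity.
# DOR = (sens / (1 - sens)) * (spec / (1 - spec))
#     = odds of a positive test in melanoma / odds of a positive test in benign lesions.

def diagnostic_odds_ratio(sens: float, spec: float) -> float:
    return (sens / (1 - sens)) * (spec / (1 - spec))

# Plugging in the pooled point estimates reported in the review:
print(round(diagnostic_odds_ratio(0.88, 0.86), 1))  # dermoscopy: ~45.0
print(round(diagnostic_odds_ratio(0.91, 0.79), 1))  # artificial intelligence: ~38.0
```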
Funnel plots showed slight evidence of publication bias, which was considered unlikely to have a major effect on results of the meta-analyses.