Fifty-seven studies were included in the review. It was unclear how many participants were recruited in these studies.
Inter-observer agreement was high. For study eligibility it was 96% (kappa 0.91) and for the various components of study quality it ranged from 89 to 100% (kappa: 0.64 to 1.0).
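The relationship between percentage agreement and kappa reported above can be illustrated with a minimal sketch. The 2x2 counts below are hypothetical and chosen only for illustration; they are not the reviewers' actual eligibility decisions, and the function name `cohen_kappa` is an assumption of this example.

```python
def cohen_kappa(both_yes, a_yes_b_no, a_no_b_yes, both_no):
    """Return (observed agreement, Cohen's kappa) for two raters and a yes/no decision."""
    n = both_yes + a_yes_b_no + a_no_b_yes + both_no
    # Proportion of items on which the two raters made the same decision.
    observed = (both_yes + both_no) / n
    # Marginal proportion of "yes" decisions for each rater.
    a_yes = (both_yes + a_yes_b_no) / n
    b_yes = (both_yes + a_no_b_yes) / n
    # Agreement expected by chance if the raters decided independently.
    expected = a_yes * b_yes + (1 - a_yes) * (1 - b_yes)
    kappa = (observed - expected) / (1 - expected)
    return observed, kappa


# Hypothetical counts: 100 candidate studies screened by two reviewers.
observed, kappa = cohen_kappa(both_yes=40, a_yes_b_no=2, a_no_b_yes=2, both_no=56)
print(f"observed agreement = {observed:.2f}, kappa = {kappa:.2f}")
```

Kappa is lower than raw agreement because it discounts the agreement that would be expected by chance alone.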
The majority of the studies did not score highly on methodological quality. Most did not report whether participant recruitment was consecutive. Fewer than half of the studies addressed patient spectrum (with respect to HRT use). The diagnostic test was generally well described, but over half of the studies using a cut-off of less than or equal to 4 mm or less than or equal to 5 mm determined this cut-off point post hoc. Few studies were blinded. However, most did verify the diagnosis, and most described what the review authors considered to be an ideal reference standard.
The most common cut-off levels used in the diagnostic tests were based on the measurement of both endometrial layers. These cut-offs, hereafter referred to as cut-off A and cut-off B respectively, were less than or equal to 4 mm (9 studies for endometrial carcinoma; 9 studies for endometrial disease) and less than or equal to 5 mm (21 studies for endometrial carcinoma; 19 studies for endometrial disease).
The pre-test probability of endometrial carcinoma was 14% (95% CI: 13.3, 14.7). A negative test result reduced the post-test probability of carcinoma to 1.2% (95% CI: 0.4, 2.9) at cut-off A and 2.3% (95% CI: 1.2, 4.8) at cut-off B. Conversely, a positive test result increased the post-test probabilities of carcinoma to 24.2% (95% CI: 19.7, 29.2) and 26.1% (95% CI: 21.1, 31.6), respectively. The likelihood ratio (LR) estimates from the cut-off A studies did not show evidence of significant heterogeneity, although none of these studies was of good quality. The pooled estimates of LRs for cut-off B studies were heterogeneous, and sensitivity analyses did not explain the heterogeneity. When the analysis was restricted to just the four best-quality studies, a negative test result reduced the post-test probability of carcinoma to 2.5% (95% CI: 0.9, 6.4).
The pre-test probability of endometrial disease was 26% (95% CI: 25, 27). A negative test result reduced the post-test probability of disease to 2.4% (95% CI: 1.3, 3.9) at cut-off A and 5% (95% CI: 2.9, 9.1) at cut-off B. A positive test result increased the post-test probabilities of disease to 43.3% (95% CI: 36.6, 46.7) and 47.9% (95% CI: 40.4, 55.6), respectively. The LR estimates from the cut-off A studies did not show evidence of significant heterogeneity, but none of these studies was of good quality. Again, heterogeneity in the results for cut-off B studies could not be explained by sensitivity analyses. Using the pooled estimate from the four best-quality studies only, a negative test result reduced the post-test probability of disease to 2.7% (95% CI: 0.9, 6.9).
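The pre-test and post-test probabilities above are linked through the likelihood ratio by the standard odds form of Bayes' theorem: post-test odds = pre-test odds x LR. The sketch below shows that conversion. The pooled LRs themselves are not stated in this section, so the example back-calculates the LR implied by the rounded percentages for cut-off A; these implied values are approximations for illustration, not the review's reported estimates, and the function names are assumptions of this example.

```python
def to_odds(probability):
    """Convert a probability (0 < p < 1) to odds."""
    return probability / (1 - probability)


def to_probability(odds):
    """Convert odds back to a probability."""
    return odds / (1 + odds)


def post_test_probability(pre_test_probability, likelihood_ratio):
    """Apply a likelihood ratio to a pre-test probability."""
    return to_probability(to_odds(pre_test_probability) * likelihood_ratio)


def implied_likelihood_ratio(pre_test_probability, post_test_prob):
    """Back-calculate the LR that links a pre-test and a post-test probability."""
    return to_odds(post_test_prob) / to_odds(pre_test_probability)


# Rounded values quoted in the text for a negative result at cut-off A.
lr_neg_carcinoma = implied_likelihood_ratio(0.14, 0.012)  # carcinoma: 14% -> 1.2%
lr_neg_disease = implied_likelihood_ratio(0.26, 0.024)    # disease: 26% -> 2.4%
print(f"implied negative LR, carcinoma: {lr_neg_carcinoma:.3f}")
print(f"implied negative LR, disease:   {lr_neg_disease:.3f}")

# Round trip: applying the implied LR reproduces the quoted post-test probability.
print(f"{post_test_probability(0.14, lr_neg_carcinoma):.3f}")
```

Working in odds rather than probabilities is what allows the same LR to be applied to populations with a different pre-test probability.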
Further analyses were reported in the paper. An additional analysis, published as a separate report, assessed the effects of delayed verification. This analysis was restricted to the 15 included studies that used a reference standard examination obtained by an independent endometrial sampling technique and that provided explicit information on the time between the index test and reference standard. The pooled diagnostic odds ratio for studies that reported immediate verification (<=24 hours between tests) was 30.6 (95% CI: 9.1, 102.6), compared with 15.6 (95% CI: 7.1, 34.1) for studies that reported delayed verification (>24 hours between tests). Sensitivity ranged from 88 to 100% in studies that reported immediate verification and from 67 to 100% in those that reported delayed verification. Specificity ranged from 31 to 83% in studies that reported immediate verification and from 39 to 77% among those that reported delayed verification.
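For readers unfamiliar with the diagnostic odds ratio (DOR), the sketch below shows how it is conventionally derived from a single study's sensitivity and specificity, as the ratio of the positive to the negative likelihood ratio. The input values are hypothetical and are not taken from any included study; how the review pooled DORs across studies is not described here, and the function name is an assumption of this example.

```python
def diagnostic_odds_ratio(sensitivity, specificity):
    """Return the DOR, i.e. LR+ / LR-, for one study.

    Both inputs must lie strictly between 0 and 1; at 0 or 1 the ratio is
    undefined (in practice a continuity correction is often applied to the
    underlying 2x2 table in that situation).
    """
    if not (0 < sensitivity < 1 and 0 < specificity < 1):
        raise ValueError("sensitivity and specificity must be strictly between 0 and 1")
    lr_positive = sensitivity / (1 - specificity)
    lr_negative = (1 - sensitivity) / specificity
    return lr_positive / lr_negative


# Hypothetical study: sensitivity 90%, specificity 60%.
print(f"DOR = {diagnostic_odds_ratio(0.90, 0.60):.1f}")
```

A higher DOR indicates better overall discrimination, which is why the lower pooled value under delayed verification suggests poorer apparent test performance in those studies.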