Systematic review and meta-analysis of the diagnostic accuracy of ultrasonography for deep vein thrombosis
Goodacre S, Sampson F, Thomas S, van Beek E, Sutton A
This review assessed the ability of ultrasound investigations to diagnose suspected deep vein thrombosis (DVT), and investigated reasons for differing results between studies. The authors concluded that duplex and triplex ultrasound are best at ruling out DVT in high-risk patients, whilst compression ultrasound is best at ruling in DVT in low-risk patients. The review was well conducted and reported and its conclusions are likely to be reliable.
To estimate the diagnostic accuracy of ultrasound (US) for suspected deep vein thrombosis (DVT); to investigate sources of between-study heterogeneity using meta-regression; and to seek evidence of publication bias.
MEDLINE, EMBASE, CINAHL, Web of Science, the Cochrane Controlled Trials Register, the Cochrane Database of Systematic Reviews, DARE and ACP Journal Club were searched from inception to April 2004. In addition, the bibliographies of included studies were screened for further relevant articles. Studies published in English, French, Spanish, Italian or German were included.
Study designs of evaluations included in the review
Diagnostic cohort studies that included at least 10 patients were eligible for inclusion. Diagnostic case-control studies, in which US results for a group of patients with DVT were compared with those of a control group of patients without DVT, were specifically excluded.
Specific interventions included in the review
Studies using US as the index test were eligible for inclusion. The included studies used compression ultrasonography, colour Doppler, continuous wave Doppler, triplex, duplex and other (not reported) US techniques. Interpretation of the US examination by a radiologist and interpretation by a sonographer were included as variables in the meta-regression. Studies that assessed repeat US but did not perform the reference standard in all patients were recorded and analysed separately.
Reference standard test against which the new test was compared
Studies using contrast venography as the reference standard of diagnosis were eligible for inclusion.
Participants included in the review
Studies of patients with clinically suspected DVT were eligible for inclusion. Studies of patients with suspected pulmonary embolus were excluded. Where reported, the mean or median age of the participants ranged from 39 to 68 years (median 57), the proportion of males ranged from 15 to 95% (median 45), and the prevalence of DVT ranged from 20 to 94% (median 48) with a proximal proportion of 48 to 100% (median 78). The cohorts were recruited from out-patient clinics, in-patients, emergency departments and mixed settings.
Outcomes assessed in the review
No inclusion criteria for the outcome measures were specified. The authors stated that they extracted 2x2 data from the included studies.
How were decisions on the relevance of primary studies made?
Two reviewers independently screened titles and abstracts to identify potentially relevant articles. Full copies of all selected articles were retrieved. The same two reviewers then independently selected articles for inclusion. Kappa scores were calculated to measure agreement at both stages and any disagreements were resolved by consensus.
Assessment of study quality
Two reviewers independently assessed study quality using the following criteria: whether recruitment was consecutive and/or data collection prospective; measurement of US blind to the result of the reference standard; and conduct of the reference standard blind to the result of US testing. Any disagreements were resolved by an independent reviewer.
Two reviewers independently extracted the data using a standardised form. A third reviewer checked and resolved any discrepancies. Data were extracted on study and patient characteristics, the US technique used and operator, true positives (proximal and distal), true negatives, false positives and false negatives (proximal and distal). Where it was not possible to extract the necessary data from the published report, authors were contacted for clarification.
Methods of synthesis
How were the studies combined?
For single US examinations, random-effects models were used to estimate the overall sensitivities and specificities with 95% confidence intervals (CIs). The results of individual included studies, as well as pooled estimates, were presented as forest plots. Where zero values occurred in the study data, a continuity correction of 0.5 was used. Analyses were conducted using MetaDiSc software.
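As an illustration of the per-study quantities that feed such a pooled analysis, the following minimal sketch computes sensitivity and specificity from one 2x2 table. It assumes that the 0.5 continuity correction is added to all four cells whenever any cell is zero; the abstract states only that a correction of 0.5 was used, so that convention and the function name `sens_spec` are assumptions. It does not reproduce the MetaDiSc random-effects calculation.

```python
def sens_spec(tp, fp, fn, tn, correction=0.5):
    """Return (sensitivity, specificity) for a single 2x2 table.

    Assumption: if any cell is zero, `correction` is added to every cell.
    The review reports only that a continuity correction of 0.5 was used.
    """
    if 0 in (tp, fp, fn, tn):
        tp, fp, fn, tn = (x + correction for x in (tp, fp, fn, tn))
    sensitivity = tp / (tp + fn)  # diseased patients correctly detected
    specificity = tn / (tn + fp)  # non-diseased patients correctly cleared
    return sensitivity, specificity
```

Without the correction, a study with no false negatives would have a sensitivity of exactly 100% and an undefined log odds, which is why some adjustment is needed before pooling on a transformed scale.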
Studies of serial US were summarised in the text and tables.
Publication bias was assessed using funnel plots of the log odds of sensitivity and specificity against their corresponding standard errors.
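The coordinates of one point on such a funnel plot could be computed as in the sketch below, which uses the standard large-sample formulae for the log odds of a proportion and its standard error. The function name `log_odds_and_se` is hypothetical, and applying the 0.5 correction to both counts when either is zero is an assumption rather than a detail reported in the review.

```python
import math

def log_odds_and_se(events, non_events, correction=0.5):
    """Return (log odds, standard error) for a proportion.

    For sensitivity pass (true positives, false negatives);
    for specificity pass (true negatives, false positives).
    Assumption: `correction` is added to both counts if either is zero.
    """
    if events == 0 or non_events == 0:
        events += correction
        non_events += correction
    log_odds = math.log(events / non_events)
    se = math.sqrt(1.0 / events + 1.0 / non_events)
    return log_odds, se
```

Plotting the log odds of each study against its standard error gives the funnel shape; smaller studies have larger standard errors and so sit further from the precise end of the axis.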
How were differences between studies investigated?
Between-study heterogeneity was assessed using a chi-squared test. Initially, all studies were pooled and meta-regression was used to identify potential causes of heterogeneity in the sensitivity and specificity separately. Covariates included participant characteristics, study characteristics and aspects of methodological quality. Any covariate that showed an association with either sensitivity or specificity (p<0.1) was used to define subgroups for separate meta-analyses. In addition, subgroup analyses by US technique were specified a priori.
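A common form of chi-squared heterogeneity test is Cochran's Q, sketched below on generic per-study effect estimates (for example, log odds of sensitivity) and their variances. The abstract does not state which statistic the authors computed, so the choice of Cochran's Q and the function name `cochran_q` are assumptions made for illustration.

```python
def cochran_q(effects, variances):
    """Return (Q, degrees of freedom) for a set of study estimates.

    effects   : per-study estimates on a common scale (e.g. log odds)
    variances : the corresponding within-study variances
    Q is referred to a chi-squared distribution with k - 1 degrees of freedom.
    """
    weights = [1.0 / v for v in variances]  # inverse-variance weights
    pooled = sum(w * y for w, y in zip(weights, effects)) / sum(weights)
    q = sum(w * (y - pooled) ** 2 for w, y in zip(weights, effects))
    return q, len(effects) - 1
```

A Q value that is large relative to its degrees of freedom indicates more variation between studies than sampling error alone would explain. The p-value can be obtained from the chi-squared survival function, for example `scipy.stats.chi2.sf(q, df)` if SciPy is available.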
Results of the review
A total of 99 studies (10,323 participants), with 100 cohorts, were included in the meta-analysis. A further 9 studies (not included in the meta-analysis), which assessed serial US, were included in the review.
Forty-eight studies reported consecutive recruitment; 67 studies were prospective. US was interpreted blind to the results of venography in 62 cohorts and venography was interpreted blind to the results of US in 56 cohorts.
The sensitivity ranged from 44 to 100%; the pooled sensitivity was 89.7% (95% CI: 88.8, 90.5). The specificity ranged from 25 to 100%; the pooled specificity was 93.8% (95% CI: 93.1, 94.4). There was significant between-study heterogeneity in both parameters (p<0.001).
A number of covariates showed significant association with variation in sensitivity and/or specificity. More recent studies, studies with a higher prevalence of DVT, and studies with a higher proportion of proximal DVT tended to have higher sensitivity. Studies that excluded patients with a previous DVT tended to have higher specificity. Studies in which the US operator was a radiologist tended to have worse diagnostic performance.
Estimates of sensitivity and specificity stratified by US technique were reported in full. The use of duplex or triplex US optimised sensitivity, while compression US optimised specificity. For the detection of proximal DVT alone (72 studies), the pooled estimate for the sensitivity of US was 94.2% (95% CI: 93.2, 95.0), while for distal DVT alone (56 studies) it was 63.5% (95% CI: 59.8, 67.0); there was significant between-study heterogeneity in both parameters (p<0.001).
In unselected cohorts, repeat US scanning had a positive yield of 0 to 2% (35 out of 2,610 overall, or 1.3%). Where repeat scanning was restricted by clinical probability or the results of a D-dimer test (2 studies), the overall positive yield increased to 22 out of 606 (3.6%).
The funnel plots appeared asymmetrical for both sensitivity and specificity, which suggests that smaller studies tend to report higher sensitivities and specificities. One possible explanation for this would be that smaller studies reporting lower values are less likely to be published.
US has high sensitivity for proximal DVT and moderate sensitivity for distal DVT. Optimal sensitivity is achieved using duplex or triplex US, and optimal specificity achieved using compression US. Sensitivity appears to be higher where there is a higher prevalence of DVT. All findings were subject to substantial unexplained heterogeneity and should be interpreted with caution. The potential benefits of repeat or serial US are uncertain.
This was a well-conducted and clearly reported review. It addressed a number of clearly stated objectives and appropriate inclusion criteria were defined. Extensive searches of the published literature were made, although language restrictions might have resulted in the loss of some relevant data. Publication bias was assessed and the authors discussed the implications of this assessment, though it should be noted that the use of funnel plots of this type to assess publication bias in diagnostic accuracy studies is a subject of debate and is thought to generate unreliable results. The review methodology was rigorous and clearly reported, and included appropriate measures to minimise error and bias. The methodological quality of the included studies was assessed using a limited number of criteria appropriate to diagnostic accuracy studies, and these were included as covariates in the meta-regression (an appropriate method of investigating the impact of study quality upon estimates of diagnostic accuracy).
The statistical methods used to generate estimates of diagnostic accuracy and to investigate between-study heterogeneity were appropriate and clearly reported. The value of pooled estimates of diagnostic accuracy is limited by the presence of significant residual heterogeneity, even within subgroups, which the authors highlighted. The authors discussed fully the limitations of the available data and its implications for their study. The authors' conclusions follow broadly from the data presented and are likely to be reliable.
Implications of the review for practice and research
Practice: The results of the review suggest that compression US is the most appropriate technique for most patients, where scanning is aimed at identifying proximal DVT; most patients have a low probability of DVT, so optimal specificity is needed to minimise false positives. In patients at high risk of DVT, or where scanning aims to detect distal DVT, duplex or triplex US is likely to be more appropriate.
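The dependence of this reasoning on pre-test probability can be illustrated with Bayes' theorem. The sketch below computes positive and negative predictive values from a sensitivity, a specificity and a prevalence; the function name `predictive_values` is hypothetical, and the calculation is a generic illustration rather than an analysis performed in the review.

```python
def predictive_values(sensitivity, specificity, prevalence):
    """Return (PPV, NPV) for a test applied at a given prevalence.

    All three arguments are proportions strictly between 0 and 1.
    """
    true_pos = sensitivity * prevalence
    false_pos = (1.0 - specificity) * (1.0 - prevalence)
    true_neg = specificity * (1.0 - prevalence)
    false_neg = (1.0 - sensitivity) * prevalence
    ppv = true_pos / (true_pos + false_pos)  # P(DVT | positive scan)
    npv = true_neg / (true_neg + false_neg)  # P(no DVT | negative scan)
    return ppv, npv
```

For example, `predictive_values(0.897, 0.938, prevalence)` applies the pooled estimates reported above. As prevalence falls, false positives make up a larger share of positive scans, which is why specificity matters most in low-probability patients.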
Research: The authors made no specific recommendations for future research.
Goodacre S, Sampson F, Thomas S, van Beek E, Sutton A. Systematic review and meta-analysis of the diagnostic accuracy of ultrasonography for deep vein thrombosis. BMC Medical Imaging 2005;5:6.
Other publications of related interest
Goodacre S, Sampson FC, Sutton AJ, Mason S, Morris F. Variation in the diagnostic performance of D-dimer for suspected deep vein thrombosis. QJM 2005;98:513-27.
This is a critical abstract of a systematic review that meets the criteria for inclusion on DARE. Each critical abstract contains a brief summary of the review methods, results and conclusions followed by a detailed critical assessment on the reliability of the review and the conclusions drawn.