Ultrasonography to evaluate adults for appendicitis: decision making based on meta-analysis and probabilistic reasoning


Orr R K, Porter D, Hartman D


Authors' objectives
To assess the performance of ultrasonography (US), and to develop recommendations for its use in the evaluation of potential appendicitis.

Searching
MEDLINE was searched from 1986 to 1994 for English language reports using the following subject headings and keywords: 'ultrasound', 'ultrasonography', 'sonography', 'appendix' and 'appendicitis'. The citation lists of retrieved literature were searched for additional material, and recent journals in radiology, surgery and emergency medicine were examined. Unpublished data were not sought and the individual authors were not contacted about their data.

Study selection

Study designs of evaluations included in the review
No inclusion criteria relating to the study design were specified. The included studies were diagnostic cohorts.

Specific interventions included in the review
Studies of graded compression US in the diagnosis of appendicitis were eligible for inclusion.

Reference standard test against which the new test was compared
No inclusion criteria relating to the reference standard were specified. The means of establishing the diagnosis used in the included studies were not reported.

Participants included in the review
Studies of adults and adolescents suspected of having appendicitis were eligible for inclusion. Studies reporting paediatric data only were excluded. The mean age of the participants in the included studies ranged from 26 to 46 years, and 28 to 66% were female.

Outcomes assessed in the review
The included studies were required to report sufficient data to populate 2x2 tables. The outcome measures calculated in the review were sensitivity, specificity, overall diagnostic accuracy, and the positive and negative predictive values (PPV and NPV, respectively).

How were decisions on the relevance of primary studies made?
The authors did not state how the papers were selected for the review, or how many reviewers performed the selection.

Assessment of study quality
The authors did not state that they assessed validity.

Data extraction
The authors did not state how the data were extracted for the review, or how many reviewers performed the data extraction.

Methods of synthesis

How were the studies combined?
Summary point estimates of sensitivity and specificity were calculated using a random-effects model. A receiver operating characteristic (ROC) curve was considered inappropriate since there was no correlation between specificity and sensitivity (Pearson's r-squared 0.03, P not significant). To simulate real-world decision-making and to determine the true usefulness of US, probability calculations were performed for three hypothetical patient groups using the mean specificity and sensitivity calculated. Group 1 comprised patients with definite signs of appendicitis who required urgent surgery; group 2 comprised patients with intermediate signs who required serial observation; and group 3 comprised patients with a low probability of appendicitis, who were usually released home.

How were differences between studies investigated?
Fisher's test for homogeneity was employed. The data for specificity and sensitivity were found to be heterogeneous (results not given). Univariate analyses of potentially confounding factors were performed. Sensitivity or specificity was used as the dependent variable; the independent variables were mean age, gender proportion, prevalence of appendicitis, publication year, and authorship (radiologist versus nonradiologist). Linear regression was used for continuous variables, while a Mann-Whitney test was applied for categorical variables.

Results of the review
Seventeen studies involving 3,358 patients, of whom 1,247 had appendicitis, were included in the meta-analysis. The pooled sensitivity of US was 84.7% (95% confidence interval, CI: 81.0, 87.8) and the pooled specificity was 92.1% (95% CI: 88.0, 95.2).

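The probability calculations for the three hypothetical patient groups can be illustrated with Bayes' theorem: for a given pretest probability of appendicitis, the pooled sensitivity and specificity determine the PPV, the NPV and the overall accuracy. The following is a minimal sketch only, not the authors' own calculation. The pretest probabilities are not reported in this abstract, so the values used below (80%, 40% and 2%) are assumptions chosen purely for illustration, and the function and variable names are likewise hypothetical.

```python
def predictive_values(sensitivity, specificity, pretest):
    """Return (ppv, npv, accuracy) for a given pretest probability of disease.

    All arguments are proportions between 0 and 1.
    """
    # Expected proportion of all tested patients in each cell of the 2x2 table.
    true_pos = sensitivity * pretest
    false_neg = (1 - sensitivity) * pretest
    true_neg = specificity * (1 - pretest)
    false_pos = (1 - specificity) * (1 - pretest)

    ppv = true_pos / (true_pos + false_pos)   # P(disease | positive test)
    npv = true_neg / (true_neg + false_neg)   # P(no disease | negative test)
    accuracy = true_pos + true_neg            # proportion correctly classified
    return ppv, npv, accuracy


# Pooled estimates reported in the review.
SENSITIVITY = 0.847
SPECIFICITY = 0.921

# ASSUMPTION: illustrative pretest probabilities, not taken from the abstract.
ASSUMED_PRETEST = {
    "group 1 (definite signs)": 0.80,
    "group 2 (intermediate signs)": 0.40,
    "group 3 (low probability)": 0.02,
}

for group, pretest in ASSUMED_PRETEST.items():
    ppv, npv, accuracy = predictive_values(SENSITIVITY, SPECIFICITY, pretest)
    print(f"{group}: PPV={ppv:.1%}  NPV={npv:.1%}  accuracy={accuracy:.1%}")
```

The sketch shows why the same test behaves so differently across the groups: with a fixed sensitivity and specificity, the PPV falls and the NPV rises as the pretest probability decreases.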
An accurate diagnosis was calculated to occur in 85.9, 88.9 and 91.8% of cases in groups 1, 2 and 3, respectively. PPVs and NPVs were calculated for each risk group. The PPV was 97.6% for group 1, 87.3% for group 2 and 19.8% for group 3. The NPV was 59.5% for group 1, 89.9% for group 2 and 99.7% for group 3. Sensitivity and specificity failed to correlate with the number of patients in individual studies, gender proportion, average age, or prevalence of appendicitis.

Authors' conclusions
US should not be used to exclude appendicitis for patients with classic signs of the illness (group 1) due to the high false-negative rate. In patients with intermediate signs of appendicitis (group 2), a positive US result should indicate the necessity of an operation, or extended observation. The high false-positive rate in those patients who have a low probability of appendicitis (group 3) suggests that US screening is not to be recommended.

CRD commentary
The review question was clear, but the inclusion criteria were poorly defined. In particular, the methods used to establish a diagnosis were not defined. The search strategy was limited and was restricted to English language publications; relevant published articles may have been missed. Further, the authors stated that they neither attempted to identify unpublished studies nor assessed the impact of publication bias on their review. The methodological quality of the included studies was not assessed and the review methodology was poorly reported. The potential impact of bias introduced by methodological flaws, in either the primary studies or the review itself, cannot, therefore, be assessed. Limited details of the included studies were reported, so it is difficult to assess the general applicability of the review.

It should also be noted that one study was excluded from the review on the grounds that it had a small sample size (n=21) and results that were widely discordant with those of the other included studies (US sensitivity 52.9%; specificity 100%). This is not a valid reason for exclusion. The authors could have performed a sensitivity analysis, which would have given more weight to their argument. The data analysis was reasonable and well described. The authors' conclusions follow broadly from the data presented, but should be viewed with caution given the limitations outlined.

Implications of the review for practice and research
Practice: The authors stated that US should not be used to exclude appendicitis for patients with classic signs of the illness (group 1) due to the high false-negative rate. In patients with intermediate signs of appendicitis (group 2), a positive US result should indicate the necessity of an operation, or extended observation. The high false-positive rate in those patients who have a low probability of appendicitis (group 3) suggested that US screening is not to be recommended.

Research: The authors stated that research should address the usefulness of US in children; compare US with computed tomography; include prospective emergency department-based studies where US is interpreted by emergency physicians, and where specific subgroups are considered. In addition, it should address the cost-effectiveness of US, and its actual impact in terms of patient care.

Bibliographic details
Orr R K, Porter D, Hartman D. Ultrasonography to evaluate adults for appendicitis: decision making based on meta-analysis and probabilistic reasoning. Academic Emergency Medicine 1995; 2(7): 644-650

PubMedID
8521213

Indexing Status
Subject indexing assigned by NLM

MeSH
Adult; Appendicitis /diagnosis /ultrasonography; Decision Support Techniques; Diagnosis, Differential; Emergency Service, Hospital; Humans; Models, Statistical; Practice Guidelines as Topic; Sensitivity and Specificity

AccessionNumber
11995001712

Date bibliographic record published
30/04/2004

Date abstract record published
30/04/2004

Record Status
This is a critical abstract of a systematic review that meets the criteria for inclusion on DARE. Each critical abstract contains a brief summary of the review methods, results and conclusions followed by a detailed critical assessment on the reliability of the review and the conclusions drawn.

Database of Abstracts of Reviews of Effects (DARE) Produced by the Centre for Reviews and Dissemination Copyright © 2024 University of York
