Authors' conclusions
The literature on the test performance of clinical symptoms and signs, laboratory and imaging tests, and multivariable diagnostic scores for the diagnosis of acute appendicitis is large, but it consists almost exclusively of studies at moderate risk of bias, primarily because of differential and incomplete verification. The few studies that assess multiple tests are typically not designed with the goal of providing comparative information. Thus, the available evidence supports fairly strong conclusions about the performance of individual tests, but it is largely insufficient to support conclusions about comparative effectiveness, especially with respect to clinical outcomes. Clinical symptoms and signs and laboratory tests have relatively limited test performance when used in isolation. Their combination in multivariable scores is promising, but the best studied scores were developed before the widespread use of imaging modalities, and more recently developed scores have not yet been studied adequately.
All three major imaging modalities have adequate test performance. Evidence on CT is mature for most patient populations of interest. In contrast, MRI has been investigated in fewer studies, many of which focus on its use for pregnant women. US produces nondiagnostic scans more often than CT or MRI, and when a diagnosis is possible, its performance appears to be somewhat worse than that of CT and MRI. Beyond test performance, information on patient-relevant outcomes and resource use is very limited.
Information on test-related harms (e.g., adverse events due to radiation) is provided by only a minority of studies and is poorly reported. More research, much of which could be accomplished through nonrandomized studies, is needed to establish test performance in understudied patient populations (very young children, women of reproductive age, the elderly) and for understudied modalities (e.g., MRI, multivariable scores); compare competing tests; identify factors that affect performance; and evaluate the impact of testing strategies on patient-relevant outcomes, resource use, and harms. Perhaps most importantly, given the large volume of accumulated evidence on the performance of various tests, decision and simulation modeling (e.g., decision analysis, simulation modeling of the impact of radiation on long-term outcomes) should be used to guide decisionmaking and to inform the design of future studies.
Address for correspondence
AHRQ, Center for Outcomes and Evidence Technology Assessment Program, 540 Gaither Road, Rockville, MD 20850, USA. Email: AHRQTAP@ahrq.hhs.gov