Systematic review and evaluation of physiological track and trigger warning systems for identifying at-risk patients on the ward
Gao H, McDonnell A, Harrison D A, Moore T, Adam S, Daly K, Esmonde L, Goldhill D R, Parry G J, Rashidian A, Subbe C P, Harvey S
The authors concluded that there is little evidence about the reliability, validity and usefulness of track and trigger warning systems for identifying at-risk patients on the ward. Overall, the review appears to have been well-conducted and the authors’ conclusions about the limitations of the identified data seem appropriate.
To evaluate the reliability, validity and utility of physiological track and trigger warning systems (TTs) for identifying at-risk patients on the ward. The review also included studies that described TTs but these are not included in this abstract.
MEDLINE, MEDLINE In-Process and Other Non-indexed Citations, EMBASE, CINAHL, PsycINFO, the Cochrane Library and Web of Science were searched from 1990 to 2004 for studies published in full in English. Details of the search strategy were presented in a supplement. In addition, three relevant specified journals and the reference lists of key reports and reviews were screened, and experts and professional bodies were contacted for details of further studies. Datasets of TTs were obtained by contacting all acute National Health Service hospitals in England that have critical care facilities. In addition, study members, their contacts and authors of published studies were contacted for further data.
Study designs of evaluations included in the review
The review obtained evidence from primary studies and datasets of patient data. Datasets were excluded if they did not clearly define the inclusion and exclusion criteria, or if fewer than half of the variables were 95% complete. Inclusion criteria for the primary studies were not specified in terms of study design.
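The completeness rule for datasets can be illustrated with a minimal sketch. It assumes that a dataset is a list of records, that a missing value is stored as `None`, and that "95% complete" means a value is present in at least 95% of records; the function and variable names are hypothetical and do not come from the review.

```python
def fraction_complete(records, variable):
    """Share of records in which `variable` is present (not None)."""
    if not records:
        return 0.0
    present = sum(1 for r in records if r.get(variable) is not None)
    return present / len(records)


def passes_completeness_rule(records, variables, threshold=0.95):
    """Return False when fewer than half of the variables reach the threshold.

    Assumption: this mirrors the exclusion rule as worded in the abstract;
    the review may have operationalised it differently.
    """
    complete = sum(
        1 for v in variables if fraction_complete(records, v) >= threshold
    )
    return complete >= len(variables) / 2
```

Under these assumptions, a dataset with four variables of which only one is at least 95% complete would be excluded, while one with two such variables would be kept.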
Specific interventions included in the review
Studies that evaluated TTs were eligible for inclusion. In the review, TTs were classified as:
single-parameter systems (periodic observation of defined criteria compared against predefined thresholds with activation of response when any criterion is met);
multiple-parameter systems (response required the meeting of more than one criterion);
aggregated scoring systems (weighted scores assigned to physiological values and compared with predefined trigger thresholds); and
combination systems (single- or multiple-parameter systems used in combination with aggregated weighted scoring systems).
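The four classes of system listed above differ only in how observations are turned into a trigger, which a short sketch may make clearer. All thresholds, score bands and the trigger score of 4 are hypothetical values chosen for illustration; they are not taken from the review or from any specific TT.

```python
from dataclasses import dataclass
from math import inf


@dataclass
class Observation:
    heart_rate: float        # beats per minute
    respiratory_rate: float  # breaths per minute
    systolic_bp: float       # mmHg


def breached_criteria(obs):
    """Names of the criteria breached (hypothetical limits)."""
    breaches = []
    if obs.heart_rate > 130 or obs.heart_rate < 40:
        breaches.append("heart_rate")
    if obs.respiratory_rate > 30 or obs.respiratory_rate < 8:
        breaches.append("respiratory_rate")
    if obs.systolic_bp < 90:
        breaches.append("systolic_bp")
    return breaches


def single_parameter_trigger(obs):
    # Response is activated when any one criterion is met.
    return len(breached_criteria(obs)) >= 1


def multiple_parameter_trigger(obs):
    # Response requires more than one criterion to be met.
    return len(breached_criteria(obs)) >= 2


# Hypothetical weighted bands: (lower inclusive, upper exclusive, score).
HR_BANDS = [(0, 40, 2), (40, 50, 1), (50, 100, 0),
            (100, 110, 1), (110, 130, 2), (130, inf, 3)]
RR_BANDS = [(0, 9, 2), (9, 15, 0), (15, 21, 1), (21, 30, 2), (30, inf, 3)]
SBP_BANDS = [(0, 70, 3), (70, 80, 2), (80, 100, 1),
             (100, 200, 0), (200, inf, 2)]


def band_score(value, bands):
    """Weighted score of the band that contains `value`."""
    for low, high, score in bands:
        if low <= value < high:
            return score
    raise ValueError(f"value {value} is outside all bands")


def aggregated_score(obs):
    return (band_score(obs.heart_rate, HR_BANDS)
            + band_score(obs.respiratory_rate, RR_BANDS)
            + band_score(obs.systolic_bp, SBP_BANDS))


def aggregated_trigger(obs, threshold=4):
    # The summed score is compared with a predefined trigger threshold.
    return aggregated_score(obs) >= threshold


def combination_trigger(obs):
    # A single-parameter rule used alongside an aggregated score.
    return single_parameter_trigger(obs) or aggregated_trigger(obs)
```

With these illustrative values, an isolated heart rate of 135 activates the single-parameter and aggregated rules but not the multiple-parameter rule, which shows how the same patient can be classified differently depending on the class of system.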
Reference standard test against which the new test was compared
Diagnostic accuracy datasets were eligible for inclusion if they used the presence of critical illness (a composite of death, admission to critical care, ‘do not attempt resuscitation’ or cardiopulmonary resuscitation) as the reference standard.
Participants included in the review
Studies of adult in-patients not in critical care areas were eligible for inclusion. Data were removed from datasets if patients were younger than 12 years of age.
Outcomes assessed in the review
Datasets that assessed admission to critical care or death were eligible for inclusion. The primary review outcomes were the sensitivity (proportion of patients with established critical illness who triggered response) and positive predictive value (PPV; proportion of triggered patients with established critical illness), while the secondary outcomes were the specificity and negative predictive value (NPV). The included studies assessed hospital mortality, 30- and 60-day mortality, intensive care unit or high-dependency unit admission, composite outcomes and cardiopulmonary resuscitation; they used a variety of methods to report diagnostic accuracy. Most of the included datasets assessed outcomes using aggregated scoring systems; all TTs included heart rate, respiratory rate, systolic blood pressure and level of consciousness. The datasets used assorted additional variables and different response algorithms.
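The four accuracy measures named above follow from a two-by-two table of trigger status against the reference standard. The sketch below uses the standard definitions; the function name and the example counts are hypothetical and are not data from the review.

```python
def diagnostic_accuracy(tp, fp, fn, tn):
    """Accuracy measures from a two-by-two table.

    tp: triggered, critically ill      fp: triggered, not critically ill
    fn: not triggered, critically ill  tn: not triggered, not critically ill
    """
    def ratio(numerator, denominator):
        # Undefined when the denominator is zero.
        return numerator / denominator if denominator else float("nan")

    return {
        "sensitivity": ratio(tp, tp + fn),  # ill patients who triggered
        "ppv": ratio(tp, tp + fp),          # triggered patients who were ill
        "specificity": ratio(tn, tn + fp),  # non-ill patients who did not trigger
        "npv": ratio(tn, tn + fn),          # non-triggered patients who were not ill
    }


# Hypothetical counts for 1,000 ward patients, 100 of whom were critically ill.
example = diagnostic_accuracy(tp=40, fp=60, fn=60, tn=840)
```

In this invented example, sensitivity and PPV are both 40% while specificity and NPV are both about 93%, a pattern that can arise when critical illness is uncommon among the patients observed.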
How were decisions on the relevance of primary studies made?
Two reviewers independently selected the studies.
Assessment of study quality
Studies were assessed using the criteria described by Laupacis et al., while the validity of clinical decision rules was assessed using criteria defined by McGinn et al. (see Other Publications of Related Interest, nos. 1-2). Datasets were assessed for methodological quality using the Quality Assessment of Diagnostic Accuracy Studies (QUADAS) tool.
Data extraction
One reviewer extracted the data, which a second reviewer then checked. Logic, range and consistency checks were applied to check the validity of the datasets. Illogical values, outliers, cases where the anonymous unique patient identifier and date of admission were missing, and cases where a composite outcome could not be identified were removed.
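The dataset checks described above could take roughly the following form. This is only a sketch: the field names and plausible ranges are hypothetical, and the identifier rule is read literally as removing a case when both the patient identifier and the admission date are missing, which the abstract does not spell out.

```python
# Hypothetical plausible ranges, used here as the range check.
PLAUSIBLE_RANGES = {
    "heart_rate": (20, 300),        # beats per minute
    "respiratory_rate": (2, 80),    # breaths per minute
    "systolic_bp": (30, 300),       # mmHg
}


def is_valid(record):
    """Apply identifier, outcome and range checks to one case."""
    # Assumption: a case is dropped when both identifier and date are missing.
    if record.get("patient_id") is None and record.get("admission_date") is None:
        return False
    # Cases where the composite outcome cannot be identified are removed.
    if record.get("composite_outcome") is None:
        return False
    # Recorded values outside the plausible range are treated as illogical.
    for name, (low, high) in PLAUSIBLE_RANGES.items():
        value = record.get(name)
        if value is not None and not (low <= value <= high):
            return False
    return True


def clean(records):
    """Keep only the cases that pass every check."""
    return [r for r in records if is_valid(r)]
```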
Methods of synthesis
How were the studies combined?
The studies were grouped by source of data (primary study or dataset). Primary studies reporting on the diagnostic accuracy of TT systems were combined in a narrative. For datasets, pooled median and quartiles were reported for the sensitivity, PPV, specificity and NPV. Receiver operating characteristic (ROC) curves were constructed for datasets and ranges in area under the curve were reported. A summary ROC curve was estimated for datasets that used the composite outcome in critical care follow-up and medical admission unit patients.
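The two dataset summaries described above, pooled medians with quartiles and the area under the ROC curve, can be sketched as follows. The quartile method (`inclusive`) is an assumption, since the abstract does not state which definition was used, and the area under the curve is computed by the standard pairwise comparison (the probability that a critically ill patient has a higher score than a patient who is not, counting ties as one half). Function names are hypothetical.

```python
from statistics import median, quantiles


def summarise(values):
    """Median and quartiles of one accuracy measure across datasets."""
    q1, _, q3 = quantiles(values, n=4, method="inclusive")
    return {"median": median(values), "q1": q1, "q3": q3}


def area_under_curve(scores_ill, scores_not_ill):
    """Area under the ROC curve for an aggregated score.

    Each pair of one critically ill and one non-ill patient counts 1 when
    the ill patient scores higher and 0.5 when the scores are tied.
    """
    wins = 0.0
    for s in scores_ill:
        for t in scores_not_ill:
            if s > t:
                wins += 1.0
            elif s == t:
                wins += 0.5
    return wins / (len(scores_ill) * len(scores_not_ill))
```

An area of 0.5 corresponds to a score that discriminates no better than chance and 1.0 to perfect discrimination.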
How were differences between studies investigated?
Differences between the datasets were considered with respect to age and specialty. Statistical heterogeneity was assessed for the 11 datasets with critical care follow-up using the Q statistic for the diagnostic odds ratio and the H-statistic. Random-effects meta-regression was used to determine the extent to which the following variables explained the heterogeneity among these 11 datasets: physiological parameters used in each TT; recorded outcome variables; and the inclusion of critical care follow-up compared with all ward or medical admission unit patients.
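For readers unfamiliar with the heterogeneity statistics mentioned above, the following sketch computes Cochran's Q for the log diagnostic odds ratio with inverse-variance weights, and H as the square root of Q divided by its degrees of freedom. The weighting scheme and the 0.5 continuity correction for tables with an empty cell are common conventions assumed here; the abstract does not describe the authors' exact computation, and the function names are hypothetical.

```python
import math


def log_dor(tp, fp, fn, tn):
    """Log diagnostic odds ratio and its approximate variance."""
    if 0 in (tp, fp, fn, tn):
        # Assumed continuity correction when any cell is empty.
        tp, fp, fn, tn = (x + 0.5 for x in (tp, fp, fn, tn))
    estimate = math.log((tp * tn) / (fp * fn))
    variance = 1 / tp + 1 / fp + 1 / fn + 1 / tn
    return estimate, variance


def q_and_h(tables):
    """Cochran's Q and the H statistic for a list of (tp, fp, fn, tn) tables.

    Requires at least two tables, so that Q has k - 1 >= 1 degrees of freedom.
    """
    estimates = [log_dor(*t) for t in tables]
    weights = [1 / variance for _, variance in estimates]
    pooled = (sum(w * y for w, (y, _) in zip(weights, estimates))
              / sum(weights))
    q = sum(w * (y - pooled) ** 2 for w, (y, _) in zip(weights, estimates))
    h = math.sqrt(q / (len(tables) - 1))
    return q, h
```

Under the hypothesis of homogeneity, Q is referred to a chi-squared distribution with k - 1 degrees of freedom, where k is the number of datasets; values of H well above 1 suggest more variation between datasets than chance alone would produce.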
Results of the review
Five studies that reported the diagnostic accuracy of 4 TT systems (n=7,873) and 15 datasets (n=20,197) were included.
Diagnostic accuracy of the TTs reported in the studies.
None of the studies met all methodological criteria and none reported a measure of variability around estimates of diagnostic accuracy. None of the TTs met the requirements of a level 1 clinical decision rule (a rule validated in various settings with ‘confidence that it can change clinical behaviour and improve patient outcomes’). Where reported, estimates of sensitivity varied from 8 to 100% and estimates of specificity varied from 18 to 100%.
Diagnostic accuracy of the TTs from datasets.
Data were received from only 31 of the 92 hospitals that indicated they collected data, and only 15 of these fulfilled the inclusion criteria. The diagnostic accuracy varied widely. Sensitivities and PPVs were low: the median sensitivity was 43.3% (quartiles: 25.4, 69.2) and the median PPV was 36.7% (quartiles: 29.3, 43.8). Specificities and NPVs were ‘generally acceptable’: the median specificity was 89.5% (quartiles: 64.2, 95.7) and the median NPV was 94.3% (quartiles: 89.5, 97.0). The area under the curve ranged from 0.61 to 0.84. Diagnostic accuracy was heterogeneous (p<0.001) among the 11 datasets with critical care follow-up. Meta-regression failed to identify the causes of this heterogeneity.
There was little evidence about the reliability, validity and usefulness of TTs.
The review question was defined in terms of the participants, intervention and outcomes. Primary studies and datasets were eligible. Several relevant sources were searched but no attempts were made to minimise publication or language bias. Methods were used to minimise reviewer error and bias in the study selection and data extraction processes, but it is unclear whether similar steps were taken for the assessment of validity. Validity was assessed and the methodological limitations of the primary studies discussed; although the validity of the datasets was apparently assessed, the results were not reported. In view of the diversity of the studies, it seems appropriate to have combined the primary studies in a narrative and combined data from datasets using medians, and explored sources of heterogeneity. Overall, the review appears to have been well-conducted and the authors’ conclusions about the limitations of the identified data seem appropriate.
Implications of the review for practice and research
Practice: The authors stated that, in view of the limited evidence about reliability, validity and utility, TTs should be used in conjunction with clinical judgement. Hospitals thinking about introducing TTs should take account of the latest evidence about the reliability and validity of systems, and should consider seeking a system that suits local conditions, is easy to use in practice and is acceptable to patients and staff.
Research: The authors stated that there is a need to validate TTs in their current settings. Large prospective studies are required to develop TTs that meet the requirements of a level 1 clinical decision rule (i.e. a rule validated in various settings with ‘confidence that it can change clinical behaviour and improve patient outcomes’).
UK National Health Service Research and Development Service Delivery & Organisation programme (SDO/74/2004).
Gao H, McDonnell A, Harrison D A, Moore T, Adam S, Daly K, Esmonde L, Goldhill D R, Parry G J, Rashidian A, Subbe C P, Harvey S. Systematic review and evaluation of physiological track and trigger warning systems for identifying at-risk patients on the ward. Intensive Care Medicine 2007; 33(4): 667-679
Other publications of related interest
1. Laupacis A, Sekar N, Stiell IG. Clinical prediction rules; a review and suggested modifications of methodological standards. JAMA 1997;277:488-94.
2. McGinn TG, Guyatt GH, Wyer PC, Naylor CD, Stiell IG, Richardson WS. Users' guides to the medical literature: XXII: how to use articles about clinical decision rules. JAMA 2000;284:79-84.
Subject indexing assigned by NLM
APACHE; Critical Care /methods; Databases, Factual; Emergency Medical Services /statistics & numerical data; Hospital Mortality; Humans; Intensive Care Units
This is a critical abstract of a systematic review that meets the criteria for inclusion on DARE. Each critical abstract contains a brief summary of the review methods, results and conclusions followed by a detailed critical assessment on the reliability of the review and the conclusions drawn.