PROSPERO International prospective register of systematic reviews

Systematic review of the Sequential Organ Failure Assessment score as a surrogate endpoint in randomized controlled trials

H.J.S. de Grooth, J.J. Parienti, H.M. Oudemans-van Straaten

Citation

H.J.S. de Grooth, J.J. Parienti, H.M. Oudemans-van Straaten. Systematic review of the Sequential Organ Failure Assessment score as a surrogate endpoint in randomized controlled trials. PROSPERO 2016:CRD42016034014. Available from http://www.crd.york.ac.uk/PROSPERO/display_record.asp?ID=CRD42016034014

Review question(s)

The aim of this systematic review is to validate the Sequential Organ Failure Assessment (SOFA) score as a surrogate endpoint for mortality in randomized controlled trials (RCTs). We will use data from published RCTs that report both SOFA and mortality endpoints.

Furthermore, we will try to identify which derivative of the SOFA score is most responsive and consistent in detecting mortality-modifying treatment effects.

Searches

PubMed, MEDLINE and Embase will be searched.

Eligible for inclusion will be RCTs in adult ICU patients reporting both a derivative of the SOFA score and a measure of mortality as primary or secondary endpoints.

Reports in languages other than English will be excluded from the analysis.

Types of study to be included

Eligible for inclusion are RCTs in adult ICU patients reporting both a derivative of SOFA score and a measure of mortality as primary or secondary endpoints.

Condition or domain being studied

Sequential Organ Failure Assessment (SOFA) score as used in randomized controlled trials in critically ill patient populations.

Participants/ population

Randomized controlled trials with adult patients admitted to the intensive care unit.

Intervention(s), exposure(s)

The Sequential Organ Failure Assessment (SOFA) score was developed to describe multiple organ dysfunction in the intensive care unit (ICU) on a scale that is easily parameterized. The SOFA score was quickly recognized as a potential surrogate endpoint for randomized controlled trials (RCTs) because serially measured SOFA scores were associated with mortality independent of the admission score. Multiple large observational studies have confirmed that serial SOFA derivatives such as delta SOFA, total maximum SOFA and mean SOFA are reliable predictors of mortality. As a result, SOFA derivatives are increasingly used as primary or secondary endpoints in RCTs. However, the association between serial SOFA scores and mortality cannot be directly carried over from observational studies to RCTs.

The aim of this study is to quantify the responsiveness and the consistency of different SOFA derivatives to reflect intervention-related changes in mortality risk.

Eligible for inclusion are RCTs in adult ICU patients reporting both a derivative of SOFA score and a measure of mortality as primary or secondary endpoints.

Comparator(s)/ control

The Sequential Organ Failure Assessment (SOFA) score will be compared to mortality as an endpoint in clinical trials.

Outcome(s)

Primary outcomes

1. The responsiveness of SOFA score as a surrogate endpoint for mortality: the change in SOFA score in response to a treatment that changes the underlying risk of mortality. The responsiveness of the surrogate endpoint is measured by the coefficient that determines the slope between the standardized between-group SOFA difference and the between-group mortality difference (odds ratio).

2. The consistency of SOFA score to reflect changes in underlying mortality risk. The consistency of SOFA score as a surrogate endpoint is measured by tau and I-squared values of the meta-regression. Consistency will be defined as good, moderate or poor for I-squared values of <25%, 25-50% and >50%, respectively.

Secondary outcomes

The cause of moderate or poor consistency will be explored by adding study-level explanatory variables (e.g. baseline SOFA and trial characteristics) as regressors in the model.

Data extraction, (selection and coding)

For each RCT, we will register and categorize the trial population category, the intervention being tested, single- or multicenter design, the primary endpoint and the analysis type (intention-to-treat or per-protocol). For each treatment arm we will register the sample size, baseline SOFA score, all reported serial SOFA scores (including standard deviation or interquartile ranges and differentiating absolute scores from delta scores) and the reported mortality rates.
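The extraction items listed above could be held in a small record structure such as the one sketched below. All class and field names are hypothetical, chosen only for illustration. The sketch also assumes that death counts are available per arm, so that the mortality rate can be derived. A trial reporting only rates would need a different field.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class SofaMeasurement:
    """One reported SOFA summary for one treatment arm (hypothetical layout)."""
    kind: str                 # "absolute" or "delta"
    day: Optional[int]        # study day, or None if not tied to a single day
    center: float             # reported mean or median
    spread: float             # reported SD, or IQR width (Q3 - Q1)
    spread_type: str          # "sd" or "iqr"


@dataclass
class TreatmentArm:
    label: str
    sample_size: int
    deaths: int               # assumption: counts are extractable
    baseline_sofa: Optional[float] = None
    sofa: List[SofaMeasurement] = field(default_factory=list)

    @property
    def mortality_rate(self) -> float:
        return self.deaths / self.sample_size


@dataclass
class TrialRecord:
    trial_id: str
    population: str           # trial population category
    intervention: str
    multicenter: bool
    primary_endpoint: str
    analysis_type: str        # "intention-to-treat" or "per-protocol"
    arms: List[TreatmentArm] = field(default_factory=list)
```

Keeping absolute and delta scores in the same structure, distinguished by a `kind` field, makes it straightforward to filter per SOFA derivative later.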

Risk of bias (quality) assessment

Trials will be graded according to the Jadad scale.
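For illustration, the Jadad scale can be sketched as a small scoring function. The sketch assumes the commonly described five-point version of the scale: one point each for being described as randomized, as double blind, and for describing withdrawals, with one point added or deducted depending on whether the randomization and blinding methods are appropriate. The function and parameter names are hypothetical.

```python
def jadad_score(randomized, randomization_method, double_blind,
                blinding_method, withdrawals_described):
    """Return a Jadad score between 0 and 5.

    randomization_method and blinding_method take the values
    "appropriate", "inappropriate", or None (method not described).
    """
    adjustment = {"appropriate": 1, "inappropriate": -1, None: 0}
    score = 0
    if randomized:
        score += 1 + adjustment[randomization_method]
    if double_blind:
        score += 1 + adjustment[blinding_method]
    if withdrawals_described:
        score += 1
    return score
```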

Strategy for data synthesis

Meta-regression at the study level will be used as the core method for the analysis.

For each trial, mortality will be expressed as the odds ratio (OR) of treatment vs. control group mortality. For studies that report multiple measures of mortality, one measure will be chosen in the following order of priority: the mortality measure reported as primary endpoint; 28-day mortality; hospital mortality; 90-day mortality; ICU mortality. For the SOFA score, the unit of analysis for the meta-regression will be the standardized difference between the control and intervention groups, defined as the between-group SOFA score difference divided by the standard deviation (SD) of the SOFA score (the square root of the mean of the variances of both groups). The standardized difference is used instead of the absolute difference to normalize the SOFA effect size across trials with different SOFA score distributions. When the SOFA score is reported as median and interquartile range (IQR), the median will be used as the best unbiased estimator of the mean and the SD will be approximated as IQR/1.35.
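The effect-size definitions above can be sketched in a few lines of Python. This is a minimal illustration, not the review's actual code. The function names are hypothetical, and the treatment-minus-control sign convention and the 0.5 continuity correction for empty cells are assumptions that the protocol does not specify.

```python
import math

# Priority order as listed in the protocol text.
MORTALITY_PRIORITY = ["primary endpoint", "28-day", "hospital", "90-day", "ICU"]


def choose_mortality_measure(reported):
    """Return the highest-priority mortality measure that a trial reports."""
    for measure in MORTALITY_PRIORITY:
        if measure in reported:
            return measure
    raise ValueError("no recognised mortality measure reported")


def log_odds_ratio(deaths_trt, n_trt, deaths_ctl, n_ctl):
    """Log odds ratio (treatment vs. control) and its sampling variance."""
    a, b = deaths_trt, n_trt - deaths_trt
    c, d = deaths_ctl, n_ctl - deaths_ctl
    if min(a, b, c, d) == 0:
        # Assumption: 0.5 continuity correction, not stated in the protocol.
        a, b, c, d = a + 0.5, b + 0.5, c + 0.5, d + 0.5
    log_or = math.log((a * d) / (b * c))
    variance = 1 / a + 1 / b + 1 / c + 1 / d
    return log_or, variance


def sd_from_iqr(q1, q3):
    """Approximate the SD from the interquartile range as IQR / 1.35."""
    return (q3 - q1) / 1.35


def standardized_sofa_difference(mean_trt, sd_trt, mean_ctl, sd_ctl):
    """Between-group difference divided by the root mean of both variances."""
    pooled_sd = math.sqrt((sd_trt ** 2 + sd_ctl ** 2) / 2)
    # Assumption: difference taken as treatment minus control.
    return (mean_trt - mean_ctl) / pooled_sd
```

For example, a trial with 10/50 deaths in the treatment arm and 20/50 in the control arm has an OR of 0.375, and SOFA means of 6 vs. 8 with an SD of 3 in both arms give a standardized difference of about -0.67.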

A mixed-effects meta-regression model will be used with log(OR) as the dependent variable, the SOFA score (standardized difference) as the fixed-effect independent variable and a random intercept for each study. The random intercept per study is applied to model heterogeneity explicitly. Fixed- and mixed-effects models produce identical results in the absence of significant between-study heterogeneity, but the mixed-effects model leads to appropriately increased standard errors when significant heterogeneity occurs. Each study will be weighted by the inverse of the sampling variance of the mortality OR (a function of mortality rate and sample size). A restricted maximum likelihood (REML) estimator will be used to estimate heterogeneity. Residuals will be checked for normality and the goodness of fit of the log-linear model will be compared to that of quadratic and power models.
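The model described above can be sketched as a random-effects meta-regression with a single covariate. The sketch below is a minimal pure-Python illustration under stated assumptions, not the software the review will use. It estimates tau-squared by minimizing the negative restricted log-likelihood with a golden-section search, which assumes the likelihood is unimodal on the interval from 0 to `tau2_max`. All function names and the `tau2_max` bound are hypothetical.

```python
import math


def _wls_fit(x, y, weights):
    """Weighted least squares for y = b0 + b1 * x with the given weights."""
    sw = sum(weights)
    swx = sum(w * xi for w, xi in zip(weights, x))
    swxx = sum(w * xi * xi for w, xi in zip(weights, x))
    swy = sum(w * yi for w, yi in zip(weights, y))
    swxy = sum(w * xi * yi for w, xi, yi in zip(weights, x, y))
    det = sw * swxx - swx ** 2          # determinant of X'WX
    slope = (sw * swxy - swx * swy) / det
    intercept = (swy - slope * swx) / sw
    return intercept, slope, det, sw


def _neg_reml_loglik(tau2, x, y, v):
    """Negative restricted log-likelihood, up to an additive constant."""
    weights = [1.0 / (vi + tau2) for vi in v]
    b0, b1, det, _ = _wls_fit(x, y, weights)
    rss = sum(w * (yi - b0 - b1 * xi) ** 2
              for w, xi, yi in zip(weights, x, y))
    return 0.5 * (sum(math.log(vi + tau2) for vi in v) + math.log(det) + rss)


def fit_meta_regression(x, y, v, tau2_max=10.0, tol=1e-10):
    """x: standardized SOFA differences, y: log(OR), v: variances of log(OR)."""
    phi = (math.sqrt(5) - 1) / 2
    lo, hi = 0.0, tau2_max
    while hi - lo > tol:                # golden-section search on tau^2
        m1 = hi - phi * (hi - lo)
        m2 = lo + phi * (hi - lo)
        if _neg_reml_loglik(m1, x, y, v) < _neg_reml_loglik(m2, x, y, v):
            hi = m2
        else:
            lo = m1
    tau2 = (lo + hi) / 2
    weights = [1.0 / (vi + tau2) for vi in v]   # inverse-variance weights
    b0, b1, det, sw = _wls_fit(x, y, weights)
    return {"intercept": b0, "slope": b1,
            "se_slope": math.sqrt(sw / det), "tau2": tau2}
```

The returned slope corresponds to the responsiveness coefficient, and the square root of `tau2` corresponds to tau. When `tau2` is estimated as zero, the weights reduce to the fixed-effects weights, which matches the statement that both models agree in the absence of heterogeneity.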

The responsiveness of the surrogate endpoint is measured by the coefficient that determines the slope between the standardized between-group SOFA difference and the between-group mortality OR.

The consistency of the SOFA score as a surrogate endpoint is measured by tau and I-squared. Tau measures the standardized residual heterogeneity and I-squared describes the percentage of total variability that is unexplained by sampling error (chance). Consistency will be defined as good, moderate or poor for I-squared values of <25%, 25-50% and >50%, respectively. The cause of moderate or poor consistency will be explored by adding study-level explanatory variables (e.g. baseline SOFA and trial characteristics) as regressors in the model.
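The consistency thresholds above can be written as a short classification rule. The sketch below assumes I-squared is derived from the residual heterogeneity statistic Q and its degrees of freedom as (Q - df) / Q, truncated at zero. This is one common definition, and the protocol does not state which one will be used. The function names are hypothetical.

```python
def i_squared(q_residual, df):
    """I-squared in percent from the residual Q statistic and its df."""
    if q_residual <= 0:
        return 0.0
    return max(0.0, (q_residual - df) / q_residual) * 100.0


def classify_consistency(i2_percent):
    """Map I-squared to the protocol's categories: <25, 25-50, >50 percent."""
    if i2_percent < 25:
        return "good"
    if i2_percent <= 50:
        return "moderate"
    return "poor"
```

For example, a residual Q of 20 on 5 degrees of freedom gives an I-squared of 75%, which would be classified as poor consistency.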

Analysis of subgroups or subsets

The meta-regression will be performed for each derivative of the SOFA score: early absolute score (day 2, 3, 4), late absolute score (day 5-14), total maximum score, delta day-X minus admission and delta maximum minus admission. A study can recur in multiple analyses if more than one SOFA derivative is reported. For this set of regression analyses, the p-values will be corrected for multiple comparisons using the method described by Hommel. The calculation of the different SOFA derivatives will be tabulated for clarity. The responsiveness (regression coefficient) and the consistency (tau and I-squared) will be compared between the different SOFA derivatives to evaluate whether any derivative is clearly superior or inferior for use as a surrogate endpoint. Regression coefficients will be compared using t-tests on model coefficients and (pooled) error variances. Nonzero tau values will be compared using F-tests on tau-squared.
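The Hommel correction mentioned above can be sketched as follows. This is an illustrative pure-Python version modelled on the algorithm used by R's `p.adjust` with `method = "hommel"`. The function name is hypothetical, and the sketch is not a validated replacement for an established statistics package.

```python
def hommel_adjust(pvalues):
    """Return Hommel-adjusted p-values in the same order as the input."""
    n = len(pvalues)
    if n <= 1:
        return list(pvalues)
    order = sorted(range(n), key=lambda i: pvalues[i])
    p = [pvalues[i] for i in order]                 # ascending p-values
    start = min(n * p[i] / (i + 1) for i in range(n))
    q = [start] * n
    pa = [start] * n
    for m in range(n - 1, 1, -1):
        head = range(0, n - m + 1)                  # the n-m+1 smallest
        tail = range(n - m + 1, n)                  # the m-1 largest
        q1 = min(m * p[i] / k for i, k in zip(tail, range(2, m + 1)))
        for i in head:
            q[i] = min(m * p[i], q1)
        for i in tail:
            q[i] = q[n - m]
        pa = [max(a, b) for a, b in zip(pa, q)]
    adjusted_sorted = [max(a, b) for a, b in zip(pa, p)]
    adjusted = [0.0] * n
    for rank, original_index in enumerate(order):   # restore input order
        adjusted[original_index] = adjusted_sorted[rank]
    return adjusted
```

Applied to the p-values of the per-derivative regression slopes, each adjusted value can then be compared directly against the chosen significance level.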

Because the SOFA score was originally designed to quantify sepsis-related organ failure, a subgroup analysis will be performed in the trials with sepsis patient populations. Responsiveness and consistency parameters will be compared for significant differences (correcting for multiple comparisons).

Contact details for further information

H.J.S. de Grooth

VU University Medical Center

dept. of Intensive Care

De Boelelaan 1117

1081 HV Amsterdam

The Netherlands

h.degrooth@vumc.nl

Organisational affiliation of the review

Department of Intensive Care, VU University Medical Center, Amsterdam, The Netherlands

Formal screening of search results against eligibility criteria

Data extraction

Risk of bias (quality) assessment

Data analysis

PROSPERO This information has been provided by the named contact for this review. CRD has accepted this information in good faith and registered the review in PROSPERO. CRD bears no responsibility or liability for the content of this registration record, any associated files or external websites.