A Recurrent Neural Network Approach for Sepsis Onset Prediction

Matthieu Scherpf, Miriam Goldammer, Hagen Malberg, Felix Gräßer
Technische Universität Dresden


Early detection and treatment of sepsis are crucial for patient outcome and treatment costs. However, identifying patterns in vital signs and laboratory measurements that reliably predict sepsis onset remains challenging. Exploiting the time-series character of these measurements is expected to play a major role in successful sepsis prediction. A further challenge for conventional machine learning methods is missing values, which interrupt these time series. In this work, we propose a recurrent neural network (RNN) with gated recurrent units (GRUs) as hidden layer units to predict sepsis six hours before onset. Ten variables serve as input to our classification model: heart rate, oxygen saturation, body temperature, systolic and diastolic blood pressure, respiration rate, white blood cell count, pH value, partial CO2 pressure, and patient age. A sliding window covering 20 hours of measurements is shifted over these input time series to reveal patterns in the multivariate signal sequence. Missing values are imputed using a last observation carried forward (LOCF) strategy. In samples where missing values precede the first valid value, those values are padded with zeros. To evaluate this approach, we employed 4-fold cross-validation. Each set of training folds was further split into training (75%) and validation (25%) partitions for hyperparameter tuning and model selection. One crucial hyperparameter is the threshold applied to the classifier's output score, which is chosen to maximize the utility score. After the first predicted occurrence of sepsis, all subsequent samples are also labeled as sepsis. The 4-fold cross-validation, covering approximately 40000 subjects, yields a mean utility score of 0.1904 with a variance of 0.0422.
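The preprocessing described above (LOCF imputation, zero padding before the first valid value, and a sliding window over the multivariate series) can be illustrated with a minimal sketch. It is not the authors' implementation. The function names `locf_zero_pad` and `sliding_windows` are hypothetical, missing values are assumed to be encoded as NaN, one row per hour is assumed, and left-padding the earliest windows with zero rows is an assumption of this sketch. A window of 4 hours is used in the example only to keep it short; the abstract specifies 20 hours.

```python
import math


def locf_zero_pad(series):
    """Forward-fill NaN values; values before the first valid one become 0.0."""
    filled = []
    last = 0.0  # zero padding until the first valid observation is seen
    for value in series:
        if not math.isnan(value):
            last = value
        filled.append(last)
    return filled


def sliding_windows(rows, window=20):
    """Return one window per time step, each ending at that step.

    rows: list of per-hour feature vectors that are already imputed.
    Windows near the start of the record are left-padded with zero rows
    (an assumption of this sketch, not stated in the abstract).
    """
    n_features = len(rows[0]) if rows else 0
    padding = [[0.0] * n_features for _ in range(window - 1)]
    padded = padding + rows
    return [padded[t:t + window] for t in range(len(rows))]


# Two illustrative channels with gaps (values are made up for the example).
nan = float("nan")
heart_rate = [nan, nan, 80.0, nan, 84.0, nan]
temperature = [36.8, nan, nan, 37.4, nan, nan]

columns = [locf_zero_pad(heart_rate), locf_zero_pad(temperature)]
rows = [list(step) for step in zip(*columns)]  # one feature vector per hour
windows = sliding_windows(rows, window=4)
```

Each element of `windows` has shape (window, features) and could serve as one input sequence to a recurrent model such as a GRU network.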
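The post-processing step, in which a threshold is applied to the classifier's output score and all samples after the first predicted sepsis are also labeled positive, can be sketched as follows. This is an illustration under stated assumptions, not the authors' code. The names `latch_predictions`, `select_threshold`, and `toy_utility` are hypothetical, a score at or above the threshold is assumed to count as positive, and `toy_utility` is a simple stand-in (fraction of correctly labeled hours), not the utility score reported in the abstract.

```python
def latch_predictions(scores, threshold):
    """Binarize scores; after the first positive, every later sample is positive."""
    labels = []
    triggered = False
    for score in scores:
        # Assumption: a score equal to the threshold counts as positive.
        triggered = triggered or score >= threshold
        labels.append(1 if triggered else 0)
    return labels


def select_threshold(score_sequences, utility_fn, candidates):
    """Return the candidate threshold with the highest utility on validation data."""
    def utility_at(threshold):
        predictions = [latch_predictions(s, threshold) for s in score_sequences]
        return utility_fn(predictions)
    return max(candidates, key=utility_at)


# Made-up validation data: per-patient score sequences and reference labels.
val_scores = [[0.1, 0.3, 0.7, 0.2, 0.4], [0.2, 0.1, 0.3, 0.2]]
val_labels = [[0, 0, 1, 1, 1], [0, 0, 0, 0]]


def toy_utility(predictions):
    """Stand-in utility: fraction of hours labeled correctly (hypothetical)."""
    pairs = [(p, y) for ps, ys in zip(predictions, val_labels) for p, y in zip(ps, ys)]
    return sum(p == y for p, y in pairs) / len(pairs)


best = select_threshold(val_scores, toy_utility, candidates=[0.25, 0.5, 0.75])
latched = latch_predictions(val_scores[0], best)
```

In practice `toy_utility` would be replaced by the evaluation metric used for model selection, computed on the validation partition of each training fold.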