Early Prediction of Sepsis Using Sequence Tagging with a Bidirectional LSTM-Conditional Random Field Deep Neural Network

Luan Tran and Cyrus Shahabi
University of Southern California


Abstract

Introduction: Early sepsis detection is crucial because each hour of delayed treatment is associated with an approximately 4-8% increase in mortality.

Aims: In this study, we propose a neural network for the early detection of sepsis from physiological data.

Methods: We formalize the problem as a sequence labeling task that assigns a label (sepsis or not) to each timestamp based on 40 time-dependent physiological variables. First, we interpolate missing data and scale the features to the range 0 to 1. Our proposed neural network consists of three sequential residual learning blocks followed by a CRF layer that outputs the probability of sepsis at each timestamp. Each residual learning block consists of the following layers in order: Batch Normalization, ReLU Activation, Convolutional, LSTM, and ReLU Activation. We use Convolutional and Bidirectional LSTM layers to capture the latent features and temporal dependencies, respectively. Batch Normalization is used to reduce overfitting. The input and output of each residual block are summed with an add operation. The final CRF layer captures the transition probabilities between timestamps. The network takes raw data as input and does not require any feature engineering. Because the training dataset is highly imbalanced, we apply a sampling technique in which, for each sequence that contains sepsis, we randomly cut portions of it to generate new training sequences.
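
The preprocessing step described above (filling missing values, then scaling each feature to the range 0 to 1) can be illustrated with a minimal sketch. The abstract does not specify the interpolation scheme or how the scaling bounds are chosen, so the following are assumptions: linear interpolation between observed values, copying the nearest observation at the edges, and min-max bounds (`lo`, `hi`) supplied by the caller. The function names are hypothetical.

```python
from typing import List, Optional


def interpolate_missing(values: List[Optional[float]]) -> List[float]:
    """Fill None gaps in one variable's time series.

    Assumption: linear interpolation between the nearest observed
    neighbours; gaps at either edge copy the nearest observation.
    """
    known = [i for i, v in enumerate(values) if v is not None]
    if not known:
        # Assumption: a variable never measured in this record becomes 0.0.
        return [0.0] * len(values)
    filled = []
    for i, v in enumerate(values):
        if v is not None:
            filled.append(float(v))
            continue
        prev = max((k for k in known if k < i), default=None)
        nxt = min((k for k in known if k > i), default=None)
        if prev is None:
            filled.append(float(values[nxt]))
        elif nxt is None:
            filled.append(float(values[prev]))
        else:
            weight = (i - prev) / (nxt - prev)
            filled.append(values[prev] + weight * (values[nxt] - values[prev]))
    return filled


def min_max_scale(values: List[float], lo: float, hi: float) -> List[float]:
    """Scale values to [0, 1] given bounds lo and hi.

    Assumption: values outside [lo, hi] are clipped, and a constant
    feature (hi == lo) maps to 0.0.
    """
    if hi == lo:
        return [0.0] * len(values)
    return [min(1.0, max(0.0, (v - lo) / (hi - lo))) for v in values]


# Usage: one hypothetical heart-rate series with missing hourly readings.
heart_rate = [None, 80.0, None, None, 110.0, None]
filled = interpolate_missing(heart_rate)
scaled = min_max_scale(filled, lo=60.0, hi=160.0)
```

In practice the bounds would presumably be computed on the training folds only, so that no information from the test fold leaks into the scaling.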

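The sampling technique mentioned in the Methods paragraph, which generates extra training sequences from records that contain sepsis, could look roughly like the sketch below. The abstract does not give the details, so several choices here are assumptions: each new sequence is a contiguous window of the original record, the window always includes the first positive timestamp, and the parameters `n_crops` and `min_len` as well as the function name `random_crops` are hypothetical.

```python
import random
from typing import List, Optional, Tuple

Sequence = Tuple[List[List[float]], List[int]]  # (features per step, label per step)


def random_crops(
    features: List[List[float]],
    labels: List[int],
    n_crops: int = 3,
    min_len: int = 8,
    rng: Optional[random.Random] = None,
) -> List[Sequence]:
    """Generate extra training sequences from one septic record.

    Assumptions: a crop is a contiguous window of at least min_len
    steps that still contains the first sepsis-positive timestamp.
    Records without a positive label, or too short to crop, yield [].
    """
    rng = rng or random.Random()
    n = len(labels)
    if 1 not in labels or n <= min_len:
        return []
    first_pos = labels.index(1)
    crops = []
    for _ in range(n_crops):
        # The start must leave room for min_len steps and not pass first_pos.
        start = rng.randint(0, min(first_pos, n - min_len))
        # The end (exclusive) must cover first_pos and respect min_len.
        end = rng.randint(max(first_pos + 1, start + min_len), n)
        crops.append((features[start:end], labels[start:end]))
    return crops


# Usage: a hypothetical 20-hour record whose last 6 hours are labeled septic.
features = [[float(t)] for t in range(20)]
labels = [0] * 14 + [1] * 6
augmented = random_crops(features, labels, n_crops=5, rng=random.Random(0))
```

Cropping only the septic records increases the share of positive sequences in the training set without altering the non-septic ones.
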
Results: We evaluate our method on the 40336 records provided by PhysioNet for its 2019 challenge using 5-fold cross-validation. In each fold, we use 32268 records for training and the remaining 8068 records for testing. The average Receiver-Operating-Characteristic AUC, Precision-Recall AUC, Accuracy, F-measure, and Utility are 0.95, 0.37, 0.98, 0.38, and 0.68, respectively. Our proposed technique outperforms a Conditional Random Field with feature engineering on all of these metrics.