Exploring the Effects of Imputation on the Early Prediction of Sepsis

Srinivasan Sivanandan¹, Daniel Dastoor¹, Sneha Desai¹, Angad Kalra¹, Liam McCoy², Michael Detsky¹, Ben Fine³
¹University of Toronto, ²University of Toronto Faculty of Medicine, ³Trillium Health Partners


Introduction: Sepsis is a major cause of morbidity and mortality in modern intensive care units (ICUs). Early detection and antibiotic treatment of sepsis are critical for improving patient survival in ICUs. Algorithms for the early prediction of sepsis from electronic health record physiological signals must deal with a high degree of missingness in laboratory values, which are ordered infrequently and at irregular intervals. Because it reflects deliberate clinical decisions to test or not, the pattern of missing and present values implicitly contains important clinical information and must be actively considered in both the prediction and imputation processes.

Methods: In this paper, we establish and compare results for sepsis prediction with recurrent and non-recurrent prediction models. Further, we compare the predictive performance of our models on datasets imputed by mean imputation, forward imputation, denoising autoencoders, learned decay factors, and a generative adversarial network. By artificially inducing missingness, we also compare these imputation strategies by their reconstruction error on the physiological signals.

Results: The best performing model in our experiments was an LSTM trained on physiological signals and their corresponding missingness indicators, obtaining an AUROC of 0.839 ± 0.004 with a utility score of 0.444 ± 0.015.

Conclusion: In exploring the effect of these imputation strategies on the predictive performance of our models, our experiments show that the simplest imputation methods perform at least as well as more sophisticated imputation techniques, demonstrating that imputation performance and prediction performance are orthogonal.
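
As an illustration of the two simplest strategies named above, the following minimal sketch shows mean imputation, forward imputation, missingness indicators, and a reconstruction error measured on artificially hidden values. It is not the authors' implementation: the function names, the use of `None` for a missing measurement, the `fallback` value used when nothing has been observed yet, and the choice of root-mean-square error as the reconstruction metric are all assumptions made for this example.

```python
import math
from typing import List, Optional, Sequence

# One physiological signal over time; None marks a missing measurement.
Series = List[Optional[float]]


def missingness_mask(series: Series) -> List[int]:
    """Return 1 where a value was observed and 0 where it was missing."""
    return [0 if v is None else 1 for v in series]


def mean_impute(series: Series, fallback: float = 0.0) -> List[float]:
    """Replace missing values with the mean of the observed values.

    `fallback` (an assumption of this sketch) is used if nothing was observed.
    """
    observed = [v for v in series if v is not None]
    mean = sum(observed) / len(observed) if observed else fallback
    return [mean if v is None else v for v in series]


def forward_impute(series: Series, fallback: float = 0.0) -> List[float]:
    """Carry the last observed value forward in time.

    Gaps before the first observation are filled with `fallback`
    (an assumption of this sketch).
    """
    filled: List[float] = []
    last = fallback
    for v in series:
        if v is not None:
            last = v
        filled.append(last)
    return filled


def reconstruction_rmse(truth: Sequence[float],
                        imputed: Sequence[float],
                        hidden_idx: Sequence[int]) -> float:
    """RMSE between true and imputed values at artificially hidden positions."""
    squared = [(truth[i] - imputed[i]) ** 2 for i in hidden_idx]
    return math.sqrt(sum(squared) / len(squared))


# Hypothetical lactate readings with gaps between laboratory draws.
lactate: Series = [None, 2.0, None, None, 4.0, None]
mask = missingness_mask(lactate)
forward_filled = forward_impute(lactate)
mean_filled = mean_impute(lactate)

# Artificially hide one known value, impute it, and score the reconstruction.
truth = [1.0, 2.0, 3.0, 4.0]
hidden_idx = [2]
corrupted: Series = [None if i in hidden_idx else v for i, v in enumerate(truth)]
error = reconstruction_rmse(truth, forward_impute(corrupted), hidden_idx)
```

In a setup like this, the imputed signal and its mask would be passed to the prediction model together, so the model sees both the filled-in value and whether it was actually measured.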