Introduction: Sepsis is a potentially life-threatening condition resulting from a dysregulated bodily response to infection that causes tissue damage, organ failure, and in severe cases, death. In current clinical practice, early diagnosis of sepsis depends largely on the ability of a healthcare professional to identify patients at risk of developing it. An accurate early detection model would represent a significant step toward reducing the mortality rate of sepsis.
Methods: As part of the CinC/PhysioNet 2019 challenge, time-series physiological and clinical measurements from a dataset of 40,336 patients were analyzed by applying an automated time-series feature extraction algorithm. Missing values were imputed using an expert systems approach based on a knowledge base of relevant clinical practices. Logistic regression was then applied to the extracted features to predict the probability that each time point in a patient's record fell within 6 hours of sepsis onset. By grouping high-probability time points together, a model was developed to predict when sepsis onset occurs for a given patient. The model was developed and tuned using a patient-level 5-fold cross-validation scheme.
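As a minimal sketch of two steps described above, the following shows one way patient-level folds and the grouping of high-probability time points might be implemented. The abstract does not specify the grouping rule or the fold assignment, so the function names, the `threshold` and `min_run` parameters, and the "first run of consecutive high-probability points" rule are illustrative assumptions rather than the authors' method.

```python
from typing import Dict, List, Optional, Sequence


def patient_level_folds(patient_ids: Sequence[str], n_folds: int = 5) -> Dict[str, int]:
    """Assign each unique patient to exactly one fold.

    Splitting by patient (not by time point) keeps all of a patient's
    time points in the same fold, so no patient appears in both the
    training and validation data of a given split.
    """
    unique_ids = sorted(set(patient_ids))
    return {pid: i % n_folds for i, pid in enumerate(unique_ids)}


def predict_onset(
    probs: Sequence[float],
    threshold: float = 0.5,  # assumed cutoff, not given in the abstract
    min_run: int = 3,        # assumed minimum run length, not given in the abstract
) -> Optional[int]:
    """Return the index of the first time point that starts a run of at
    least `min_run` consecutive probabilities >= `threshold`, or None if
    no such run exists."""
    run_start: Optional[int] = None
    run_length = 0
    for t, p in enumerate(probs):
        if p >= threshold:
            if run_length == 0:
                run_start = t  # a new candidate run begins here
            run_length += 1
            if run_length >= min_run:
                return run_start
        else:
            run_length = 0  # run broken; start over
    return None


if __name__ == "__main__":
    # Hypothetical per-hour probabilities for a single patient.
    hourly_probs: List[float] = [0.1, 0.6, 0.2, 0.7, 0.8, 0.9, 0.3]
    print(predict_onset(hourly_probs))
    print(patient_level_folds(["p1", "p1", "p2", "p3"], n_folds=2))
```

Requiring a run of consecutive high-probability points, rather than a single point above the cutoff, is one simple way to suppress isolated false alarms; in practice both parameters would be tuned within the cross-validation scheme.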
Results: With the available data, the proposed model achieved an average area under the ROC curve of 0.72, an average accuracy of 89%, and an average utility score (as defined in the PhysioNet challenge) of 0.28.
Discussion: Because the model is linear, the statistical importance and interpretation of its features can guide further refinement, and the resulting tuned model will then be compared with traditional survival analyses and deep learning approaches. Our present results nonetheless suggest that the model has the potential to better inform clinical diagnosis of sepsis in the intensive care unit and, through model interpretation, to guide further innovations in sepsis diagnosis.