Anomaly Detection Semi-supervised Framework Using Deep Reinforcement Learning for Sepsis Treatment.

Ines Krissaane
Harvard Medical School


Abstract

Sepsis is defined as life-threatening organ dysfunction caused by a dysregulated host response to infection (Singer et al., 2016). The disease process is still not fully understood, and research to improve treatment is ongoing. Using a large dataset of more than 40,000 patients, we propose a novel approach to anomaly detection that combines unsupervised and supervised methods: autoencoder neural networks (Beggel et al., 2019) and the XGBoost algorithm (Chen et al., 2016). When combined into a single dataset, the 40,336 files of the PhysioNet/CinC Challenge 2019 contain 1,552,210 records with 42 fields, and fewer than 2% of the raw records indicate sepsis. We propose a framework that uses autoencoder networks to learn from this challenging, imbalanced dataset with few features, which lets us give insight into which features contribute most to detecting sepsis and identify optimal time conditions for sepsis. We then provide an effective way to predict sepsis using XGBoost, an algorithm widely used by data scientists to achieve state-of-the-art results in machine learning. The proposed framework discards the implicit time series structure suggested by the data collection and takes advantage of tree boosting methods, which produce a strong learner by effectively combining weak learners. The individual trees are shallow, which makes them more interpretable, and we tune parameters such as the number of trees, the number of iterations, the learning rate of the gradient boosting, and the tree depth using k-fold cross-validation. We obtain an overall normalized score of 0.79 on the training datasets with the utility function created for the Challenge.
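
To illustrate what combining weak learners means in the boosting step, the following is a minimal sketch of gradient boosting with depth-1 trees (stumps) on the logistic loss. It is not XGBoost itself, which additionally uses second-order gradient information and regularization, and it is not the authors' pipeline. The data are synthetic with roughly 10% positives standing in for an imbalanced label, and all names (fit_stump, train_boosting, raw_score, n_rounds, learning_rate) are hypothetical choices made for this example.

```python
import math
import random


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def log_loss(y, scores):
    """Mean logistic loss of raw (log-odds) scores against 0/1 labels."""
    eps = 1e-12
    total = 0.0
    for yi, s in zip(y, scores):
        p = min(max(sigmoid(s), eps), 1 - eps)
        total -= yi * math.log(p) + (1 - yi) * math.log(1 - p)
    return total / len(y)


def fit_stump(X, residuals):
    """Weak learner: the one-feature threshold split with the lowest squared error."""
    best = None
    for j in range(len(X[0])):
        values = sorted(set(row[j] for row in X))
        for lo, hi in zip(values, values[1:]):
            t = (lo + hi) / 2.0
            left = [r for row, r in zip(X, residuals) if row[j] <= t]
            right = [r for row, r in zip(X, residuals) if row[j] > t]
            lv, rv = sum(left) / len(left), sum(right) / len(right)
            sse = sum((r - lv) ** 2 for r in left) + sum((r - rv) ** 2 for r in right)
            if best is None or sse < best[0]:
                best = (sse, j, t, lv, rv)
    return best[1:]


def train_boosting(X, y, n_rounds=20, learning_rate=0.3):
    """Fit each stump to the negative gradient (y - p) and add it to the ensemble."""
    pos_rate = sum(y) / len(y)
    base = math.log(pos_rate / (1 - pos_rate))  # constant starting model
    scores = [base] * len(y)
    stumps = []
    for _ in range(n_rounds):
        residuals = [yi - sigmoid(s) for yi, s in zip(y, scores)]
        j, t, lv, rv = fit_stump(X, residuals)
        stumps.append((j, t, lv, rv))
        scores = [s + learning_rate * (lv if row[j] <= t else rv)
                  for row, s in zip(X, scores)]
    return base, learning_rate, stumps


def raw_score(model, row):
    """Log-odds predicted by the ensemble for one sample."""
    base, lr, stumps = model
    return base + lr * sum(lv if row[j] <= t else rv for j, t, lv, rv in stumps)


# Synthetic, imbalanced data (an assumption for illustration only):
# the 20 highest-risk samples out of 200 are labelled positive.
rng = random.Random(0)
X = [[rng.gauss(0, 1), rng.gauss(0, 1)] for _ in range(200)]
risk = [row[0] + 0.3 * rng.gauss(0, 1) for row in X]
cutoff = sorted(risk)[-20]
y = [1 if r >= cutoff else 0 for r in risk]

model = train_boosting(X, y, n_rounds=20, learning_rate=0.3)
initial_loss = log_loss(y, [model[0]] * len(y))
final_loss = log_loss(y, [raw_score(model, row) for row in X])
print("training log loss: constant model vs boosted ensemble")
print(round(initial_loss, 4), round(final_loss, 4))
```

The number of rounds, the learning rate, and the tree depth (fixed at 1 here) correspond to the kind of parameters the abstract describes tuning by k-fold cross-validation; in this sketch they are set by hand and evaluated on the training data only.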