Automatic Cardiac Abnormality Detection in 12-lead ECGs with Deep Convolutional Neural Networks Using Data Augmentation

Lucas Weber1, Maksym Gaiduk1, Ralf Seepold2
1HTWG Konstanz, 2HTWG Konstanz, I.M. Sechenov First Moscow State Medical University


Abstract

In this work, a deep convolutional neural network (CNN) was trained and applied to the data of the PhysioNet/Computing in Cardiology Challenge 2020 to detect nine different classes of cardiac abnormalities from 12-lead ECGs. The network solves a multi-class, multi-label classification problem. To learn expressive features from raw data, the training of such networks requires large numbers of training samples, which are not always available. To keep the deep network from overfitting and to facilitate training, we applied several augmentation techniques to the available training data: additive Gaussian noise, a masking method similar to CutOut from the computer vision domain, signal shifting, and the classification of signal sections of different lengths with the same network structure. These techniques multiply the effective training data and make it possible to train deep neural networks despite a limited amount of training data. In addition, a global pooling layer after the feature extractor makes the network agnostic to the length of the input signal.
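As an illustration, the three signal-level augmentations above can be sketched as follows. The function name and all hyperparameter values (noise standard deviation, shift range, mask length) are illustrative assumptions, not the exact settings used in this work.

```python
import numpy as np

def augment_ecg(signal, rng, noise_std=0.01, max_shift=250, cutout_len=500):
    """Augment a (leads, samples) ECG array.

    noise_std, max_shift, and cutout_len are placeholder values chosen
    for illustration only.
    """
    sig = signal.copy()
    n_leads, n_samples = sig.shape

    # Additive Gaussian noise on every lead.
    sig += rng.normal(0.0, noise_std, size=sig.shape)

    # Signal shifting: roll the whole recording by a random offset.
    shift = rng.integers(-max_shift, max_shift + 1)
    sig = np.roll(sig, shift, axis=1)

    # CutOut-like masking: zero out one random contiguous section.
    start = rng.integers(0, n_samples - cutout_len)
    sig[:, start:start + cutout_len] = 0.0

    return sig
```

Because each call draws new random noise, shift, and mask position, repeated application to the same recording yields many distinct training samples.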

A combination of these approaches and early stopping allows the training of a ResNet-like network with a little over five million trainable parameters on around five thousand samples. The algorithm obtains an AUPRC of 0.86, an F-beta score of 0.84, and a G-beta score of 0.63 on the training dataset (5-fold cross-validation). On the hidden test set of the challenge, an ensemble of all models from the cross-validation obtains an F-beta score of 0.82 and a G-beta score of 0.62 (team UC_Lab_Kn). These results show the potential of deep neural networks for a variety of tasks, even when the amount of training data is limited. The training and inference code will be made publicly available on GitHub after the end of the challenge.
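To illustrate how a global pooling layer after the feature extractor makes the classifier independent of signal length, here is a minimal NumPy sketch. The one-layer "feature extractor", the random weights, and all dimensions are illustrative assumptions and do not reproduce the actual ResNet-like architecture.

```python
import numpy as np

rng = np.random.default_rng(42)

# Toy stand-in for the convolutional feature extractor:
# 64 feature maps over 12 leads, kernel width 15, random weights.
kernel = rng.normal(size=(64, 12, 15))

def extract_features(ecg):
    """Valid 1-D convolution: (12, T) -> (64, T - 14)."""
    _, t = ecg.shape
    out_len = t - kernel.shape[2] + 1
    out = np.empty((kernel.shape[0], out_len))
    for f in range(kernel.shape[0]):
        acc = np.zeros(out_len)
        for lead in range(12):
            acc += np.convolve(ecg[lead], kernel[f, lead][::-1], mode="valid")
        out[f] = acc
    return out

def classify(ecg, w, b):
    """Global average pooling yields a fixed-size vector for any input length."""
    pooled = extract_features(ecg).mean(axis=-1)      # (64,) regardless of T
    return 1.0 / (1.0 + np.exp(-(w @ pooled + b)))    # nine class probabilities

w = rng.normal(size=(9, 64))
b = np.zeros(9)
short = rng.normal(size=(12, 3000))   # short recording
long_ = rng.normal(size=(12, 9000))   # long recording, same network
```

Both `classify(short, w, b)` and `classify(long_, w, b)` return a vector of nine probabilities, since pooling collapses the time axis before the dense output layer.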