Regularization and Augmentation in 12-Lead Electrocardiograms Classification Using Artificial Neural Networks

Konstantin Egorov
Sberbank AI Lab


Electrocardiography (ECG) is one of the most common diagnostic methods for various heart diseases. While the equipment becomes cheaper every day, there is a severe worldwide shortage of experienced specialists who can interpret ECGs with high accuracy, which increases the importance of automatic interpretation software. Recent advances in machine learning have brought very powerful tools, such as artificial neural networks (ANNs). It has been shown that ANN-based models can interpret ECG signals on par with experienced cardiologists, and that the only limiting factor is the amount and quality of the data used for training. ANNs require thousands of records to achieve human-level performance, and currently there is no open database that can be used to successfully train such accurate algorithms. Therefore, it is important to use the available databases as efficiently as possible. Our work focuses on finding ways to reduce overfitting and improve generalisation when training ANNs on a limited amount of data. We use the dataset provided by the PhysioNet/CinC Challenge 2020. To reduce overfitting, we tested and benchmarked several common techniques, such as L1/L2 regularisation, dropout and batch/layer normalisation. We found a set of augmentations that significantly improves performance. We also trained an autoencoder network and used its encoder part as pretrained weights for the classification network. Another very effective method that we tested is adding auxiliary targets, such as different pathologies and the age and sex of the patient, during the training of ANNs. Using these methods, we significantly improved the quality of our baseline network and achieved an overall F2 score of 0.811, a G2 score of 0.599 and a geometric mean of 0.697 on the challenge leaderboard.
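
As a quick consistency check on the reported leaderboard numbers, the sketch below recomputes the geometric mean from the two scores. It assumes that the reported geometric mean is taken over the F2 and G2 scores; the function name `geometric_mean` and the variable names are illustrative and do not come from the challenge code.

```python
import math


def geometric_mean(a: float, b: float) -> float:
    """Return the geometric mean of two non-negative scores."""
    if a < 0 or b < 0:
        raise ValueError("scores must be non-negative")
    return math.sqrt(a * b)


# Scores reported in the abstract.
f2_score = 0.811
g2_score = 0.599

# Assumption: the leaderboard value combines exactly these two scores.
combined = geometric_mean(f2_score, g2_score)
print(round(combined, 3))  # prints 0.697
```

Under this assumption the three reported values agree to three decimal places.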