Deep Learning Identification of Concurrent Cardiac Arrhythmias from Electrocardiogram Signals

Jhih-Yu Chen1, Tsai-Min Chen2, Chih-Han Huang1, Edward S.C. Shih1, Justine Hsu1, Yi-Ming Chen1, Yu Tsao3, Ming-Jing Hwang1
1Institute of Biomedical Sciences, Academia Sinica, 2Graduate Program of Data Science, National Taiwan University and Academia Sinica, 3Research Center for Information Technology Innovation, Academia Sinica


Aims: This study aims to accurately identify multiple types of cardiac arrhythmias, including concurrently diagnosed ones, from 12-lead electrocardiogram (ECG) signals using a deep learning model.

Methods: Our model consisted of 5 convolutional neural network (CNN) blocks, each comprising 2 convolutional layers followed by a max-pooling layer; the 5 blocks were followed by a bidirectional gated recurrent unit layer and an attention layer. The model was trained on a dataset provided by CinC2020, which contained 12-lead ECG signals recorded from thousands of individuals. We randomly and evenly divided the dataset into 10 parts to set up a 10-fold training-validation-test scheme, with 8 of the 10 folds used for training and the remaining 2 folds used for validation and testing, respectively. We derived 100 models in each of the 10 training runs and selected the model with the lowest loss on the validation set, resulting in 10 best-validation models. We evaluated these best-validation models on their respective test sets using the CinC2020-designated scoring functions: area under the receiver operating characteristic curve (AUROC), area under the precision-recall curve (AUPRC), accuracy, Fmeasure, Fbeta and Gbeta.
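
The 10-fold training-validation-test scheme and the lowest-validation-loss selection described above can be illustrated with a minimal sketch. The rotation rule (fold k for testing, fold k+1 for validation), the record count of 1000, and the function names are assumptions made for illustration; the abstract does not specify how the validation and test folds were paired in each run.

```python
import random


def make_folds(n_records, n_folds=10, seed=0):
    """Shuffle record indices and deal them into n_folds near-equal parts."""
    indices = list(range(n_records))
    random.Random(seed).shuffle(indices)
    return [indices[i::n_folds] for i in range(n_folds)]


def split_for_run(folds, run):
    """Assumed rotation: fold `run` is the test set, the next fold is validation,
    and the remaining 8 folds form the training set."""
    n_folds = len(folds)
    test_id = run
    val_id = (run + 1) % n_folds
    train = [
        idx
        for fold_id, fold in enumerate(folds)
        if fold_id not in (test_id, val_id)
        for idx in fold
    ]
    return train, folds[val_id], folds[test_id]


def select_best(val_losses):
    """Return the index of the candidate model with the lowest validation loss."""
    return min(range(len(val_losses)), key=val_losses.__getitem__)


# Hypothetical dataset size, chosen only to make the fold sizes easy to read.
folds = make_folds(n_records=1000)
train, val, test = split_for_run(folds, run=0)
print(len(train), len(val), len(test))  # 800 100 100
```

Calling `split_for_run` for runs 0 through 9 yields the 10 training runs; within each run, `select_best` would be applied to the validation losses of the 100 candidate models.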

Results: On the data released by CinC2020 in the first phase, the median AUROC, AUPRC, accuracy, Fmeasure, Fbeta and Gbeta for the 10-fold tests of our model were 96.5%, 84%, 96.3%, 79.2%, 78.9% and 59.7%, respectively. When evaluated and reported by CinC2020 on the leaderboard, our model's results for Fbeta, Gbeta, and geometric mean were 79.6%, 59.4%, and 68.8%, respectively.
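
As a quick consistency check on the leaderboard figures, the reported geometric mean can be reproduced from the reported Fbeta and Gbeta. This sketch assumes that the leaderboard's "geometric mean" is the geometric mean of those two scores, which the abstract does not state explicitly but which agrees with the reported numbers; the function name is illustrative.

```python
import math


def geometric_mean(f_beta, g_beta):
    """Geometric mean of two scores (assumed to be the leaderboard metric)."""
    return math.sqrt(f_beta * g_beta)


# Leaderboard values reported above, in percent.
score = geometric_mean(79.6, 59.4)
print(round(score, 1))  # 68.8
```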

Conclusions: Our model, despite having a rather simple deep learning architecture, achieved promising results. As CinC2020 releases more data, our model can be expected to improve further, with potential for real-world clinical applications.