Automated Diagnosis of 12-lead Electrocardiograms using an Ensemble of Generic Temporal Convolutional Networks

Max Bos1, Jeroen Vranken1, Rutger van de Leur2, René van Es2
1Informatics Institute, University of Amsterdam, 2UMC Utrecht


Aims This study aimed to improve automatic diagnosis of a selection of nine predefined electrocardiogram (ECG) abnormalities as part of the 2020 Physionet/Computing in Cardiology Challenge.

Methods A deep neural network was constructed with exponentially dilated causal convolutions. The architecture belongs to the family of generic temporal convolutional networks (GTCNs), and is composed of several 1-dimensional causal convolution blocks, followed by a 1D global max pooling layer squeezing the temporal dimension and a linear layer.
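The core operations named above can be illustrated with a minimal, dependency-free sketch. It is not the network used in the study: the kernel size, the number of layers, the use of a single convolution per block, and all function names are assumptions made for illustration. It shows that a causal convolution only looks at current and past samples, that dilations of 1, 2, 4, ... grow the receptive field exponentially with depth, and that global max pooling reduces a sequence of any length to one value per channel.

```python
def causal_dilated_conv1d(x, weights, dilation):
    """Single-channel causal convolution: y[t] = sum_k weights[k] * x[t - k*dilation].

    Samples before t = 0 are treated as zero, so y[t] never depends on the future.
    """
    out = []
    for t in range(len(x)):
        acc = 0.0
        for k, w in enumerate(weights):
            idx = t - k * dilation
            if idx >= 0:
                acc += w * x[idx]
        out.append(acc)
    return out


def receptive_field(kernel_size, num_layers):
    """Receptive field of stacked causal convolutions with dilations 1, 2, 4, ...

    Assumes one convolution per layer; each layer adds (kernel_size - 1) * dilation.
    """
    return 1 + (kernel_size - 1) * sum(2 ** i for i in range(num_layers))


def global_max_pool(x):
    """Squeeze the temporal dimension by keeping the maximum activation."""
    return max(x)


# Hypothetical settings: kernel size 3, four layers.
print(receptive_field(kernel_size=3, num_layers=4))
print(causal_dilated_conv1d([1, 2, 3, 4], weights=[1, 1], dilation=2))
```

Because the pooling step removes the time axis, the linear layer that follows sees a fixed-size vector regardless of the recording length.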

The network was trained on the challenge dataset, using only the first 5000 samples of each recording during training. Shorter recordings were zero-padded per training batch. The network was pre-trained on a physician-annotated dataset of 180,573 12-lead ECGs acquired at the UMC Utrecht. The same classes as in the challenge dataset were derived from the annotations using a text mining algorithm. After pre-training, the weights of the convolutional layers were frozen and only the last fully connected layer was retrained on the challenge dataset.
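The truncation and per-batch zero-padding step could look roughly like the sketch below. It assumes that padding is appended at the end of each lead and that recordings are padded to the longest (truncated) recording in the batch; the abstract does not specify either detail, and the function name `collate_batch` is hypothetical.

```python
MAX_LEN = 5000  # number of leading samples kept per recording during training


def collate_batch(recordings, max_len=MAX_LEN):
    """Truncate and zero-pad a batch of recordings to a common length.

    recordings: list of recordings, each a list of leads, each lead a list of floats.
    Assumption: zeros are appended at the end, up to the longest recording in the batch.
    """
    truncated = [[lead[:max_len] for lead in rec] for rec in recordings]
    batch_len = max(len(rec[0]) for rec in truncated)
    return [
        [lead + [0.0] * (batch_len - len(lead)) for lead in rec]
        for rec in truncated
    ]


# Toy example with two leads per recording and a small max_len for readability.
batch = collate_batch(
    [[[1.0] * 3, [2.0] * 3], [[1.0] * 5, [2.0] * 5]],
    max_len=4,
)
print([len(rec[0]) for rec in batch])
```

Padding per batch rather than to a global length keeps the amount of added zeros small when a batch happens to contain only short recordings.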

Ten-fold cross-validation was applied, and the model with the highest geometric mean of the F2- and G2-scores in each fold was selected for the ensemble. Final probability scores were obtained by taking the mean of the 10 individual model probability outputs, and final predictions by applying a threshold of 50\%. Since the selected network architecture allows for an arbitrary input length, the full length of each test sample was used during inference.
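The ensembling step described above amounts to averaging per-class probabilities and thresholding the result. The following is a minimal sketch with a hypothetical function name; whether a probability of exactly 0.5 counts as positive is not stated in the abstract, so the strict comparison here is an assumption.

```python
def ensemble_predict(model_probs, threshold=0.5):
    """Average per-class probabilities over models, then threshold.

    model_probs: one list of class probabilities per model, all for the same recording.
    Returns (mean_probabilities, binary_labels).
    Assumption: a class is predicted when its mean probability exceeds the threshold.
    """
    n_models = len(model_probs)
    n_classes = len(model_probs[0])
    mean = [sum(p[c] for p in model_probs) / n_models for c in range(n_classes)]
    labels = [int(m > threshold) for m in mean]
    return mean, labels


# Two models and two classes for brevity; the study used 10 models and 9 classes.
mean, labels = ensemble_predict([[0.9, 0.2], [0.7, 0.4]])
print(mean, labels)
```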

Results In cross-validation, the F2-score, G2-score, and their geometric mean were 0.816 $\pm$ 0.009, 0.586 $\pm$ 0.010, and 0.692 $\pm$ 0.009, respectively. On the independent test set, the F2-score, G2-score, and their geometric mean were $0.830$, $0.595$, and $0.703$, respectively. This placed our network 10th on the phase I leaderboard.
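For readers unfamiliar with these metrics, the sketch below shows how they can be computed for a single class. The formulas are an assumption based on the commonly cited definitions for this challenge phase (an F-beta score and a Jaccard-style G-beta score, both with beta = 2); the abstract itself does not define them, and the class weighting applied by the challenge is omitted here.

```python
import math


def f_beta(tp, fp, fn, beta=2):
    """F-beta score; with beta = 2, false negatives weigh more than false positives."""
    denom = (1 + beta ** 2) * tp + fp + beta ** 2 * fn
    return (1 + beta ** 2) * tp / denom if denom else 0.0


def g_beta(tp, fp, fn, beta=2):
    """Jaccard-style G-beta score (assumed definition): TP / (TP + FP + beta * FN)."""
    denom = tp + fp + beta * fn
    return tp / denom if denom else 0.0


def geometric_mean(f, g):
    """Geometric mean of the two scores."""
    return math.sqrt(f * g)


# Consistency check against the reported test-set scores.
print(round(geometric_mean(0.830, 0.595), 3))
```

Applied to the reported test-set F2- and G2-scores, the geometric mean rounds to the reported value of 0.703.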

Conclusion The proposed ensemble of GTCNs generalized well to the independent test set, reaching a geometric mean of the F2- and G2-scores of $0.703$ and 10th place on the phase I leaderboard.