Identifying Abnormalities in a 12-Lead Electrocardiogram Image Stack with Convolutional Neural Networks

Jenny Venton1, Ashish Sundar1, Ryan Moran1, Spencer A Thomas1, Peter M Harris1, Nadia A S Smith1, Merima Causevic2, Alen Bosnjakovic2, Vedran Karahodzic2, Claudia Nagel3, Axel Loewe3, Nicolas Pilia3, Olaf Dössel3, Loïc Coquelin4, Jane V Lyle5, Philip J Aston6
1National Physical Laboratory, 2Institute of Metrology of Bosnia and Herzegovina, 3Institute of Biomedical Engineering, Karlsruhe Institute of Technology, 4National Laboratory of Metrology and Testing, 5University of Surrey, Mathematics, 6National Physical Laboratory/University of Surrey, Mathematics


Aim: This study aimed to identify the clinical diagnoses present in a 12-lead electrocardiogram (ECG) by transforming the signals into images and then classifying the resulting image stack with a convolutional neural network. We used Symmetric Projection Attractor Reconstruction (SPAR) to generate a 2D image from each single-lead ECG signal.
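
As an illustration only, the signal-to-image step might look like the sketch below. It assumes the three-point form of SPAR: delay coordinates x(t), x(t - τ), x(t - 2τ) with τ equal to one third of the cycle length, projected onto the plane orthogonal to (1, 1, 1) and binned into a 2D density. The function names, the fixed `period` argument and the bin count are hypothetical; the abstract does not state how the cycle length or image resolution were chosen.

```python
import numpy as np


def spar_coordinates(signal, period):
    """Project three-point delay coordinates onto the plane orthogonal to (1, 1, 1).

    `period` is the assumed cycle length in samples (hypothetical input).
    """
    x = np.asarray(signal, dtype=float)
    tau = period // 3  # delay of one third of a cycle
    if tau < 1 or x.size <= 2 * tau:
        raise ValueError("signal is too short for the requested period")
    x0 = x[2 * tau:]      # x(t)
    x1 = x[tau:-tau]      # x(t - tau)
    x2 = x[:-2 * tau]     # x(t - 2 tau)
    # A constant baseline offset cancels in both projected coordinates.
    v = (x0 + x1 - 2.0 * x2) / np.sqrt(6.0)
    w = (x0 - x1) / np.sqrt(2.0)
    return v, w


def spar_image(signal, period, bins=64):
    """Bin the projected attractor into a 2D array of counts (the 'image')."""
    v, w = spar_coordinates(signal, period)
    counts, _, _ = np.histogram2d(v, w, bins=bins)
    return counts


def spar_stack(leads, period, bins=64):
    """Stack one image per lead: (n_leads, n_samples) -> (n_leads, bins, bins)."""
    return np.stack([spar_image(lead, period, bins) for lead in leads])
```

Calling `spar_stack` on an array of shape (12, n_samples) returns an array of shape (12, bins, bins), which matches the 3D stack of twelve 2D images described in the Method.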

Method: A 3D stack of twelve 2D SPAR images (one for each lead) was generated for each subject in the training set. We used a convolutional neural network classifier with five convolutional layers to extract important features and four dense layers for the classification. To capture the relationships between the twelve ECG signals, the first convolutional layer used 3D convolutional filters coupled with 3D pooling filters. The network was trained with a custom loss function based on the Fβ and Gβ definitions so that each subject could be assigned to one or more diagnostic classes.
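
To make the role of the 3D filter concrete, the following NumPy sketch applies a single 3D filter and a 3D max-pooling step to an image stack. It is not the network described above: the kernel size, pooling size and function names are hypothetical, and a real layer would learn many such filters inside a deep learning framework. The point is only that a filter with depth greater than one combines pixels from neighbouring leads in the stack, which a 2D filter applied lead by lead cannot do.

```python
import numpy as np
from numpy.lib.stride_tricks import sliding_window_view


def conv3d_valid(stack, kernel):
    """Slide one 3D filter over a (leads, height, width) stack without padding.

    This is cross-correlation, the operation that deep learning frameworks
    call 'convolution'.
    """
    windows = sliding_window_view(stack, kernel.shape)
    # windows has shape (out_d, out_h, out_w, k_d, k_h, k_w)
    return np.einsum("ijkabc,abc->ijk", windows, kernel)


def max_pool3d(volume, size):
    """Non-overlapping 3D max pooling; trailing elements that do not fit are dropped."""
    sd, sh, sw = size
    d, h, w = (n // s for n, s in zip(volume.shape, size))
    trimmed = volume[: d * sd, : h * sh, : w * sw]
    return trimmed.reshape(d, sd, h, sh, w, sw).max(axis=(1, 3, 5))


# Hypothetical sizes: 12 leads, 32 x 32 pixel images, one 3 x 3 x 3 filter.
stack = np.ones((12, 32, 32))
kernel = np.ones((3, 3, 3))
features = conv3d_valid(stack, kernel)      # each output mixes 3 adjacent leads
pooled = max_pool3d(features, (2, 2, 2))    # pooling also acts across leads
```

With these illustrative sizes, `features` has shape (10, 30, 30) and `pooled` has shape (5, 15, 15).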

We aim to improve network performance by deriving additional features, including ECG intervals, prior to classification. Additional image stacks comprising frequency-based scalogram and spectrogram images derived from the ECG signals will supply further features for classification. Finally, we will use techniques that exploit label dependence and label re-weighting to improve classification of the poorer-performing labels.

Results: Preliminary 3-fold cross-validated training performance results using the SPAR image stack are Fβ = 0.47, Gβ = 0.22 (β = 2).
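
The abstract does not give the Fβ and Gβ formulas. The sketch below assumes the commonly used count-based definitions, Fβ = (1 + β²)TP / ((1 + β²)TP + FP + β²FN) and Gβ = TP / (TP + FP + βFN), computed per class from binary multi-label matrices and then averaged over classes. Any per-recording weighting applied in the actual scoring is not reproduced here, and the function names are hypothetical.

```python
import numpy as np


def confusion_counts(y_true, y_pred):
    """Per-class TP, FP, FN from binary (n_subjects, n_classes) label matrices."""
    y_true = np.asarray(y_true, dtype=bool)
    y_pred = np.asarray(y_pred, dtype=bool)
    tp = np.sum(y_true & y_pred, axis=0).astype(float)
    fp = np.sum(~y_true & y_pred, axis=0).astype(float)
    fn = np.sum(y_true & ~y_pred, axis=0).astype(float)
    return tp, fp, fn


def _safe_ratio(num, den):
    """Element-wise num / den, returning 0 where the denominator is 0."""
    return np.divide(num, den, out=np.zeros_like(num), where=den > 0)


def f_beta(tp, fp, fn, beta=2.0):
    """Assumed definition: (1 + b^2) TP / ((1 + b^2) TP + FP + b^2 FN)."""
    num = (1.0 + beta**2) * tp
    return _safe_ratio(num, num + fp + beta**2 * fn)


def g_beta(tp, fp, fn, beta=2.0):
    """Assumed definition: TP / (TP + FP + b FN)."""
    return _safe_ratio(tp, tp + fp + beta * fn)


# Toy example with four subjects and two diagnostic classes.
y_true = [[1, 0], [1, 1], [0, 1], [0, 0]]
y_pred = [[1, 0], [0, 1], [0, 1], [1, 0]]
tp, fp, fn = confusion_counts(y_true, y_pred)
mean_f = f_beta(tp, fp, fn).mean()   # class scores 0.5 and 1.0 -> 0.75
mean_g = g_beta(tp, fp, fn).mean()   # class scores 0.25 and 1.0 -> 0.625
```

With β = 2, both scores penalise a missed diagnosis (false negative) more heavily than a false alarm, and Gβ is never larger than Fβ for the same counts, which is consistent with the reported Gβ being lower than Fβ.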

Conclusion: Visual representations of ECG signals make it possible to apply a convolutional neural network, which is typically used for image classification problems, to the classification of 12-lead ECG signals. Furthermore, using a 3D filter exploits the relationships between the signals from the different leads.