12-Lead ECG classification using non-linear feature-based learning

Jieun Lee, Samuel Newell, Vasanth Ravikumar, Xiangzhen Kong, Yugene Guo, Alena Talkachova
University of Minnesota


Cardiovascular diseases have been the leading cause of death globally for the last 15 years. Ischemic heart disease and stroke are of special concern, causing a combined 15.2 million deaths in 2018. Atrial fibrillation, bundle branch blocks, atrioventricular blocks, and premature beats indicate abnormal heart rhythms and are precursors to fatal heart conditions. Monitoring a patient’s heart condition from 12-lead ECG recordings conventionally requires analysis by a trained clinician, so automated labeling of these recordings is desirable.

In this work, we implement a multinomial regression model to perform automated labeling of 12-lead ECGs based on several feature selection criteria. These criteria include: (i) morphological features (name few) of the 12-lead ECG signals, (ii) linear techniques (RR interval, HRV), and (iii) nonlinear methods (multiscale frequency, multiscale entropy, kurtosis, time delay embedding dimension) to uncover the intrinsic complexity of the ECG signal. These clinically based and data-driven feature selection criteria were combined with machine learning algorithms to improve outcomes. In addition, lead selection was performed per disease to improve the physiological relevance of our analysis. Finally, 5-fold cross-validation was performed on an initial dataset of 6877 patients with single or multiple diseases.
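As an illustration of two of the nonlinear features named above, the following is a minimal sketch of kurtosis and multiscale entropy for a single-lead signal. It is not the implementation used in this work: the function names, the use of sample entropy on coarse-grained series, and the parameters `m=2` and `r_factor=0.2` are assumptions chosen as common defaults, and the test signal is synthetic noise rather than ECG data.

```python
import math
import random


def kurtosis(x):
    """Pearson kurtosis (fourth standardized moment); about 3 for Gaussian data."""
    n = len(x)
    mean = sum(x) / n
    m2 = sum((v - mean) ** 2 for v in x) / n
    m4 = sum((v - mean) ** 4 for v in x) / n
    return m4 / (m2 ** 2)


def coarse_grain(x, scale):
    """Average consecutive non-overlapping windows of length `scale`."""
    n = len(x) // scale
    return [sum(x[i * scale:(i + 1) * scale]) / scale for i in range(n)]


def sample_entropy(x, m=2, r=0.2):
    """Sample entropy -ln(A/B) with absolute tolerance r (Chebyshev distance)."""
    n = len(x)

    def count_matches(length):
        # Use the same n - m starting points for both template lengths.
        templates = [x[i:i + length] for i in range(n - m)]
        count = 0
        for i in range(len(templates)):
            for j in range(i + 1, len(templates)):
                dist = max(abs(a - b) for a, b in zip(templates[i], templates[j]))
                if dist <= r:
                    count += 1
        return count

    b = count_matches(m)      # matching template pairs of length m
    a = count_matches(m + 1)  # matching template pairs of length m + 1
    if a == 0 or b == 0:
        return float("inf")   # undefined for too few matches
    return -math.log(a / b)


def multiscale_entropy(x, scales=(1, 2, 3), m=2, r_factor=0.2):
    """Sample entropy of the coarse-grained signal at each scale.

    The tolerance is fixed at r_factor times the standard deviation
    of the original (scale-1) signal.
    """
    mean = sum(x) / len(x)
    std = math.sqrt(sum((v - mean) ** 2 for v in x) / len(x))
    r = r_factor * std
    return [sample_entropy(coarse_grain(x, s), m=m, r=r) for s in scales]


# Synthetic stand-in for a single ECG lead (assumption: white Gaussian noise).
rng = random.Random(0)
noise = [rng.gauss(0.0, 1.0) for _ in range(300)]
mse_noise = multiscale_entropy(noise, scales=(1, 2, 3))
print("kurtosis:", kurtosis(noise))
print("multiscale entropy:", mse_noise)
```

In a feature-based pipeline, values such as these would be computed per lead and concatenated with the morphological and linear features before being passed to the classifier.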

The model distinguishes the different diseases with high specificity: every class has a specificity of at least 82%, and the model obtained an F-score of 0.46. We further aim to optimize feature engineering and include additional morphological parameters to improve sensitivity without reducing specificity.
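For reference, per-class specificity and an F-score can be computed from multi-label predictions as in the sketch below. This is an assumption-laden illustration rather than the scoring used in this work: the function names are hypothetical, the F-score shown is an unweighted macro-averaged F1 (the exact F-score variant is not stated above), and the labels are toy data.

```python
def confusion_counts(y_true, y_pred, c):
    """Return (tp, fp, fn, tn) for class c; labels are sets of class indices."""
    tp = fp = fn = tn = 0
    for t, p in zip(y_true, y_pred):
        if c in t and c in p:
            tp += 1
        elif c not in t and c in p:
            fp += 1
        elif c in t and c not in p:
            fn += 1
        else:
            tn += 1
    return tp, fp, fn, tn


def per_class_specificity(y_true, y_pred, n_classes):
    """Specificity TN / (TN + FP) for each class."""
    result = []
    for c in range(n_classes):
        _, fp, _, tn = confusion_counts(y_true, y_pred, c)
        result.append(tn / (tn + fp) if (tn + fp) else float("nan"))
    return result


def macro_f1(y_true, y_pred, n_classes):
    """Unweighted mean of per-class F1 = 2TP / (2TP + FP + FN)."""
    scores = []
    for c in range(n_classes):
        tp, fp, fn, _ = confusion_counts(y_true, y_pred, c)
        denom = 2 * tp + fp + fn
        scores.append(2 * tp / denom if denom else 0.0)
    return sum(scores) / n_classes


# Toy example: 5 recordings, 3 classes, one recording with two labels.
y_true = [{0}, {1}, {0, 2}, {2}, {1}]
y_pred = [{0}, {0}, {2}, {2}, {1}]
spec = per_class_specificity(y_true, y_pred, 3)
f1 = macro_f1(y_true, y_pred, 3)
print("specificity per class:", spec)
print("macro F1:", f1)
```

Because most recordings are negative for any given class, specificity can stay high even when sensitivity, and therefore the F-score, is modest.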