A Bio-toolKit for Multi-Cardiac Abnormality Diagnosis Using 12-Lead ECG Signals and Deep Learning

Akash Kirodiwal1, Apoorva Srivastava1, Ashutosh Dash1, Sazedul Alam2, Sawon Pratiher3, Amit Patra4, Nirmalya Ghosh5, Nilanjan Banerjee6
1Electrical Engineering Department, IIT Kharagpur, 2PhD Student, CSEE, UMBC, 3Electrical Engineering Department, IIT Kharagpur, 4Professor, IIT Kharagpur, 5Assistant Professor, IIT Kharagpur, 6Associate Professor, CSEE, UMBC


Abstract

Early-stage clinical diagnosis of cardiac abnormalities can improve a heart patient's chances of survival by predicting cardiovascular morbidity and mortality. Cardiac screening modalities such as the 12-lead electrocardiogram (ECG) are widely used to detect cardiac arrhythmias. However, manual interpretation of ECGs is tedious and depends on domain expertise. Automated detection of cardiovascular diseases (CVDs) from the ever-expanding number of ECG records can therefore aid physicians and cardiac professionals in the prognosis of CVDs. In this work, we propose a bio-toolkit for multi-class arrhythmia classification. We first performed a first-level signal quality assessment of the three augmented limb leads (aVL, aVR, and aVF) using Goldberger's technique. Thereafter, time-domain features such as RMSSD and pNN50 were extracted using the Pan-Tompkins algorithm and the discrete wavelet transform. Statistical significance analysis using Student's t-test, together with box plots, shows intra-class discriminability. The majority of the features we calculated are motivated by a literature review of the 2017 PhysioNet/CinC Challenge. Further, we explored an array of deep learning frameworks, including the recurrent neural network (RNN) family, i.e., LSTM and Bi-LSTM, and convolutional neural networks (CNNs) such as the 1D-CNN, for deep cardiac feature learning and classification. Initially, we developed a one-dimensional CNN model with two convolutional layers followed by one dropout layer, one pooling layer, and one fully connected layer. To test its efficacy, we trained the model on features extracted from a subset of the available training data and optimized it with Adam, a stochastic gradient descent-based optimizer. Preliminary results with our deep learning model give an overall F_2-score of 0.89 on the challenge dataset.
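For reference, the F_2-score weights recall more heavily than precision, which suits screening tasks where a missed abnormality costs more than a false alarm. The sketch below shows the standard binary F-beta formula with beta = 2. The function name `fbeta_score` and the toy labels are illustrative assumptions, and the challenge's official scoring may aggregate over classes and recordings differently.

```python
def fbeta_score(y_true, y_pred, beta=2.0):
    """Binary F-beta score from two equal-length sequences of 0/1 labels.

    Illustrative helper (not the challenge's official scorer).
    beta > 1 weights recall more heavily than precision.
    """
    if len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must have the same length")

    # Count true positives, false positives, and false negatives.
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)

    b2 = beta * beta
    denom = (1.0 + b2) * tp + b2 * fn + fp
    # Return 0.0 when there are no positives at all (undefined case).
    return (1.0 + b2) * tp / denom if denom else 0.0


# Toy example (made-up labels): 3 true positives, 1 miss, 2 false alarms.
y_true = [1, 1, 1, 1, 0, 0]
y_pred = [1, 1, 1, 0, 1, 1]
f2 = fbeta_score(y_true, y_pred, beta=2.0)
f1 = fbeta_score(y_true, y_pred, beta=1.0)
print(f"F2 = {f2:.3f}, F1 = {f1:.3f}")
```

With these toy labels recall (3/4) exceeds precision (3/5), so the F_2 value comes out higher than the F_1 value.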
A comparative study using the ECG time-domain features and hierarchical machine learning classifiers yields a class-specific accuracy of 96.43% and a multi-class classification accuracy of 33.69%. The attached block diagram shows the overall problem-solving approach.
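As an illustration of the time-domain features named above, the following minimal sketch computes RMSSD (root mean square of successive RR-interval differences) and pNN50 (percentage of successive RR-interval differences exceeding 50 ms). It assumes R-peak sample indices are already available, for example from a Pan-Tompkins style detector. The function names, the 500 Hz sampling rate, and the peak positions are hypothetical and not taken from the study.

```python
import math


def rr_intervals_ms(r_peaks, fs):
    """Convert R-peak sample indices to RR intervals in milliseconds."""
    return [(b - a) * 1000.0 / fs for a, b in zip(r_peaks, r_peaks[1:])]


def successive_diffs(rr_ms):
    """Differences between consecutive RR intervals (ms)."""
    diffs = [b - a for a, b in zip(rr_ms, rr_ms[1:])]
    if not diffs:
        raise ValueError("need at least two RR intervals")
    return diffs


def rmssd(rr_ms):
    """Root mean square of successive RR-interval differences (ms)."""
    diffs = successive_diffs(rr_ms)
    return math.sqrt(sum(d * d for d in diffs) / len(diffs))


def pnn50(rr_ms, threshold_ms=50.0):
    """Percentage of successive differences larger than threshold_ms."""
    diffs = successive_diffs(rr_ms)
    count = sum(1 for d in diffs if abs(d) > threshold_ms)
    return 100.0 * count / len(diffs)


# Hypothetical R-peak positions (sample indices) at an assumed 500 Hz rate.
r_peaks = [0, 400, 805, 1240, 1670, 2020]
rr = rr_intervals_ms(r_peaks, fs=500)  # [800.0, 810.0, 870.0, 860.0, 700.0]
print(f"RMSSD = {rmssd(rr):.2f} ms, pNN50 = {pnn50(rr):.1f} %")
```

Features computed this way per recording could then be fed to a classifier. In practice, ectopic beats and detection errors are usually filtered out of the RR series first.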