Explainable Deep Neural Network for Identifying Cardiac Abnormalities Using Class Activation Map

Yu-Cheng Lin (b05502145@ntu.edu.tw), Yun-Chieh Lee (b06901014@ntu.edu.tw), Wen-Chiao Tsai (b05901065@ntu.edu.tw), Win-Ken Beh (kane@access.ee.ntu.edu.tw), An-Yeu Wu (andywu@ntu.edu.tw)
"NTU-Accesslab" Team
Graduate Institute of Electronic Engineering, National Taiwan University, Taipei, Taiwan


Abstract

"Explainable Deep Neural Network for Identifying Cardiac Abnormalities Using Class Activation Map"

Yu-Cheng Lin (b05502145@ntu.edu.tw), Yun-Chieh Lee (b06901014@ntu.edu.tw), Wen-Chiao Tsai (b05901065@ntu.edu.tw), Win-Ken Beh (kane@access.ee.ntu.edu.tw), An-Yeu Wu (andywu@ntu.edu.tw)

“NTU-Accesslab” Team Graduate Institute of Electronic Engineering, National Taiwan University Taipei, Taiwan

In this study, we present a deep convolutional neural network (CNN) approach, called CNN-GAP, for classifying 12-lead ECGs with multiple cardiac abnormality labels. In addition, Class Activation Mapping (CAM) is employed to examine the decision-making process of this black-box model and make it more explainable. For instance, we observed that for most records labeled STD or STE, the model does not focus on the ST segment even when its prediction is correct, which suggests that our preprocessing or model structure should be modified to emphasize ST-segment features. The proposed CNN-GAP model consists of 12 convolutional blocks with batch normalization, followed by global average pooling and a fully connected layer with sigmoid activation. To deal with class imbalance, we oversample the minority classes. In the training stage, we apply a macro F_2 loss instead of the conventional cross-entropy (CE) loss, and we show both theoretically and experimentally that this yields faster convergence and higher F_2 scores. We also augment the data by random flipping and scaling to improve scores and prevent overfitting. Finally, our final model is a bootstrap ensemble of three CNN-GAP models, each trained on a different subset of the training set. This model achieved (F_2, G_2, geometric mean) scores of (0.802, 0.598, 0.693) in the unofficial phase and average scores of (0.816, 0.592, 0.695) on the training set with 3-fold cross-validation.
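
To make the architecture concrete, the following is a minimal PyTorch sketch of a CNN-GAP classifier of the kind described above, together with the CAM computation that global average pooling enables. The layer width, kernel size, number of classes, and the names ConvBlock, CNNGAP, and class_activation_map are illustrative assumptions, not the authors' exact configuration.

import torch
import torch.nn as nn

class ConvBlock(nn.Module):
    # One convolutional block: 1-D convolution + batch normalization + ReLU.
    def __init__(self, in_ch, out_ch, kernel=7):
        super().__init__()
        self.net = nn.Sequential(
            nn.Conv1d(in_ch, out_ch, kernel, padding=kernel // 2),
            nn.BatchNorm1d(out_ch),
            nn.ReLU(),
        )

    def forward(self, x):
        return self.net(x)

class CNNGAP(nn.Module):
    # 12 conv blocks -> global average pooling -> fully connected layer.
    def __init__(self, n_leads=12, n_classes=27, width=64, n_blocks=12):
        super().__init__()
        channels = [n_leads] + [width] * n_blocks
        self.features = nn.Sequential(
            *[ConvBlock(channels[i], channels[i + 1]) for i in range(n_blocks)]
        )
        self.gap = nn.AdaptiveAvgPool1d(1)      # global average pooling over time
        self.fc = nn.Linear(width, n_classes)   # sigmoid is applied in the loss / at inference

    def forward(self, x):
        feat = self.features(x)                 # (batch, width, time)
        logits = self.fc(self.gap(feat).squeeze(-1))
        return logits, feat

def class_activation_map(model, x, class_idx):
    # CAM: weight each final feature map by the FC weight of the target class
    # and sum over channels, giving a per-time-step relevance curve.
    with torch.no_grad():
        _, feat = model(x)                      # (batch, width, time)
        w = model.fc.weight[class_idx]          # (width,)
        return torch.einsum("c,bct->bt", w, feat)

Because GAP is followed by a single fully connected layer, the CAM for a class is simply a channel-wise weighted sum of the last feature maps, so no extra gradient computation is needed to see which parts of the ECG (e.g., the ST segment) drive a prediction.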
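
The macro F_2 objective can be made differentiable by replacing hard true-positive, false-positive, and false-negative counts with predicted probabilities. Below is a minimal sketch of such a soft macro F_beta loss, assuming sigmoid outputs; the authors' exact formulation may differ.

import torch

def macro_f2_loss(probs, targets, beta=2.0, eps=1e-7):
    # probs, targets: (batch, n_classes); targets are 0/1 floats.
    # Soft per-class counts accumulated over the batch.
    tp = (probs * targets).sum(dim=0)
    fp = (probs * (1.0 - targets)).sum(dim=0)
    fn = ((1.0 - probs) * targets).sum(dim=0)
    f_beta = (1.0 + beta**2) * tp / ((1.0 + beta**2) * tp + (beta**2) * fn + fp + eps)
    return 1.0 - f_beta.mean()   # macro average over classes, turned into a loss

# usage (assumed names): loss = macro_f2_loss(torch.sigmoid(logits), labels.float())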