Dynamic Time Warping with Gradient Boosting Tree Ensemble for 12-Lead ECG Multilabel Classification

Alexander William Wong1, Weijie Sun2, Sunil Vasu Kalmady2, Padma Kaul1, Abram Hindle1
1University of Alberta, 2Canadian VIGOUR Centre


Abstract

Standard 12-lead electrocardiograms (ECGs) are commonly used to detect cardiac irregularities such as atrial fibrillation, blocks, and irregular complexes. For the PhysioNet/CinC 2020 challenge, we built an algorithm using gradient boosted tree ensembles fitted on morphology and signal processing features.

We used the ecgpuwave implementation of the Pan-Tompkins method to detect the P-wave, QRS complex, and T-wave. To isolate candidate templates for each cardiac abnormality, we applied a minimum-distance criterion to select the templates that exhibited maximum similarity with the rest of the ECG records in the same class. We used these templates with Dynamic Time Warping (DTW), an algorithm for measuring similarity between temporal sequences that vary in speed. From the annotated signals we derived morphology-related durations, amplitudes, and intervals. Additional signal representation techniques, including discrete Fourier transforms and polynomial function fitting, were also used to extract features. We concatenated the features for each of the 12 leads and fit an ensemble of gradient boosting trees to predict the probability of each ECG instance belonging to each class. We evaluated our model using 5-fold cross-validation on the challenge-provided dataset.
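
As an illustration of the template step, the following is a minimal sketch of a DTW distance and a minimum-total-distance template choice. It is not the implementation used for the challenge entry: the function names `dtw_distance` and `select_template` are hypothetical, the absolute-difference local cost is an assumption, and the interpretation of the minimum distance criterion as "smallest summed DTW distance to all other beats in the class" is also an assumption.

```python
import math


def dtw_distance(a, b):
    """Classic O(len(a) * len(b)) dynamic time warping distance.

    Assumption: the local cost is the absolute difference between samples.
    """
    n, m = len(a), len(b)
    # cost[i][j] = cheapest alignment of a[:i] with b[:j]
    cost = [[math.inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            d = abs(a[i - 1] - b[j - 1])
            # Extend the cheapest of: step in a only, step in b only, step in both
            cost[i][j] = d + min(cost[i - 1][j], cost[i][j - 1], cost[i - 1][j - 1])
    return cost[n][m]


def select_template(beats):
    """Return the index of the beat with the smallest summed DTW distance
    to every beat in the same class (hypothetical selection rule)."""
    totals = [sum(dtw_distance(x, y) for y in beats) for x in beats]
    return min(range(len(beats)), key=totals.__getitem__)


if __name__ == "__main__":
    # Toy "beats": the second is a time-stretched copy of the first
    beats = [[0, 1, 0], [0, 1, 1, 0], [5, 5, 5]]
    idx = select_template(beats)
    print("template index:", idx)
    # DTW distance of a new record to the chosen template, usable as a feature
    print("feature:", dtw_distance([0, 1, 2, 1, 0], beats[idx]))
```

In a pipeline like the one described, the distance from each record to each class template would be appended to the feature vector alongside the morphology and signal processing features.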

Initial results show that ST-segment elevation is the most challenging label to classify correctly. Averaged over the validation sets of a 5-fold cross-validation split of the provided data, our model achieves an F_2 score of 0.7199 and a G_2 score of 0.5296. On the preliminary PhysioNet leaderboard, our team, CVC, has an F_2 score of 0.681 and a G_2 score of 0.469. These scores were obtained before the inclusion of DTW features, which, according to our internal validation, improve the classification scoring metrics. The preliminary scores may therefore not accurately reflect the latest methodology described in this abstract.
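
For readers unfamiliar with the two metrics, the sketch below computes an F_beta and a G_beta score from confusion counts for a single label. The F_beta formula is the standard one. The G_beta form (a Jaccard-style measure that weights false negatives by beta) is our understanding of the challenge's scoring and should be treated as an assumption. The function name `beta_scores` is hypothetical, and the sketch omits any per-recording weighting or averaging across classes that the official scoring may apply.

```python
def beta_scores(tp, fp, fn, beta=2.0):
    """Return (F_beta, G_beta) for one label from confusion counts.

    Assumption: G_beta = TP / (TP + FP + beta * FN).
    """
    b2 = beta * beta
    f_den = (1 + b2) * tp + fp + b2 * fn
    g_den = tp + fp + beta * fn
    # Guard against labels with no positives and no predictions
    f = (1 + b2) * tp / f_den if f_den else 0.0
    g = tp / g_den if g_den else 0.0
    return f, g


if __name__ == "__main__":
    f2, g2 = beta_scores(tp=8, fp=2, fn=2)
    print(f"F_2 = {f2:.4f}, G_2 = {g2:.4f}")
```

With beta = 2, both scores penalize a missed diagnosis more heavily than a false alarm.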