Multi-class classification of pathologies found on short electrocardiogram signals

Georgi Nalbantov, Svetoslav Ivanov, Jeffrey van Prehn
Data Science Consulting Ltd.


Abstract

Introduction. Predicting several key cardiac pathologies simultaneously from ECG (electrocardiogram) signals within a given timeframe is an important step towards real-world application of AI models in cardiology. Such a multi-class classification task requires not only well-performing binary classification models, but also a way to combine them into an overall classification structure. We approached this task using data from the PhysioNet 2020 Challenge.

Materials and Methods. The PhysioNet 2020 Challenge supplied a total of 6877 short 12-lead ECG strips, which contain manifestations of 8 common ECG pathologies (sometimes more than one pathology in a given strip) as well as normal sinus rhythm. A total of 6622 unique strips remained after removing duplicates. An annotation tool for labelling ECG wave points and intervals/templates was created in Matlab and used to label PVC (premature ventricular contraction) and PAC (premature atrial contraction) intervals, as well as noisy intervals and inconsistencies between the ECG data and the pre-assigned labels. Several binary classifiers were built on morphological features generated from the signals and specific to each pathology. The binary classifiers were combined into a final multi-class classifier using an ECOC (error-correcting output codes) methodology.
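As an illustration of how binary classifiers can be combined through an output code, the following is a minimal sketch of ECOC decoding with a tree-shaped ternary code matrix. The matrix, the four placeholder classes and the function name `ecoc_decode` are hypothetical and chosen only for illustration; they are not the code matrix or classes used in this work.

```python
import numpy as np

# Hypothetical ternary code matrix (rows = classes, columns = binary classifiers).
# +1 / -1 give the side of the binary split a class belongs to;
# 0 means the class takes no part in that binary problem.
# Column 0 splits {0, 1} from {2, 3}; column 1 splits 0 from 1; column 2 splits 2 from 3.
CODE_MATRIX = np.array([
    [+1, +1,  0],   # class 0
    [+1, -1,  0],   # class 1
    [-1,  0, +1],   # class 2
    [-1,  0, -1],   # class 3
])


def ecoc_decode(scores, code_matrix=CODE_MATRIX):
    """Return the index of the class whose codeword is closest to the scores.

    `scores` holds one real-valued output in [-1, 1] per binary classifier.
    The distance is a soft Hamming distance; entries where the codeword is 0
    are masked out and contribute nothing.
    """
    scores = np.asarray(scores, dtype=float)
    mask = code_matrix != 0
    distances = (mask * (1.0 - code_matrix * scores) / 2.0).sum(axis=1)
    return int(np.argmin(distances))


# Example: the first classifier votes for {0, 1}, the second votes for class 1.
predicted = ecoc_decode([0.9, -0.8, 0.3])
```

This sketch returns a single class per strip. Since some strips carry more than one pathology, a multi-label variant would need a different decoding rule, for instance thresholding the per-class distances.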

Results. On the test set of the Challenge, the multi-class classification scores were 0.779 for F-beta (beta = 2), 0.569 for G-beta, and 0.666 for the geometric mean of these two measures. Cross-validation on the training set gave 0.797, 0.608 and 0.696, respectively, for the same measures.
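The three reported measures can be sketched as follows. The F-beta and G-beta formulas below are an assumption: they follow the definitions commonly given for the Challenge scoring (G-beta as a Jaccard-style measure that weights false negatives by beta) and are not spelled out in this abstract. The official scoring may additionally weight or average counts per class or per recording, which this sketch omits. The function names are illustrative.

```python
import math


def f_beta(tp, fp, fn, beta=2.0):
    """F-beta from true positive, false positive and false negative counts."""
    b2 = beta ** 2
    return (1 + b2) * tp / ((1 + b2) * tp + fp + b2 * fn)


def g_beta(tp, fp, fn, beta=2.0):
    """Assumed G-beta definition: Jaccard-style measure penalising false negatives by beta."""
    return tp / (tp + fp + beta * fn)


def geometric_mean(f, g):
    """Geometric mean of the two measures."""
    return math.sqrt(f * g)


# Consistency check against the reported figures.
test_gm = round(geometric_mean(0.779, 0.569), 3)  # reported test-set value: 0.666
cv_gm = round(geometric_mean(0.797, 0.608), 3)    # reported cross-validation value: 0.696
```

The geometric means computed from the reported F-beta and G-beta values agree with the reported third figure on both the test set and the cross-validation runs.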

Conclusion. We have created a multi-class classification machine learning approach for predicting ECG pathologies. It comprises the generation of ECG features specific to the respective pathologies, annotation of ECG signals to extract samples of these features, and an expert-based tree structure for combining binary models into a multi-class classification scheme using an ECOC methodology.