Generative pre-training (GPT) using the Transformer architecture has disrupted the field of natural language processing in recent years. We propose a generalization framework for GPT on electrocardiography (ECG) signals by viewing each ECG recording as a sentence and each cardiac cycle as a word of that sentence. In the context of the CinC2021 challenge, we first use this framework to predict cardiac cycles and then to predict cardiac abnormalities.
The model architecture is a Transformer encoder with 5 layers, 8 self-attention heads, a model dimension of 1000, a feed-forward dimension of 2048, and a dropout rate of 0.1. The final layer is replaced by a fully connected layer with a sigmoid activation. Each cardiac cycle is fed to the model as a “word vector” of 1000 samples. The word vector is generated from all provided leads of a single cardiac cycle through a word vector generator (WVG). As a starting point, the WVG computes the root mean square (RMS) across leads and applies zero-padding such that the R-peaks of all cardiac cycles are aligned. The maximum sentence length (i.e., number of cycles) is set to 50; shorter recordings are zero-padded at the end, and only the first 50 cycles of longer recordings are kept.
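The RMS-based word vector generator and the sentence padding/truncation can be sketched as follows. This is an illustrative reconstruction, not the authors' code: the helper names (`rms_word_vector`, `build_sentence`) are hypothetical, and centering the R-peak in the fixed-length vector is one possible alignment convention consistent with the description above.

```python
import numpy as np

def rms_word_vector(cycle_leads, r_peak_idx, word_len=1000):
    """Collapse one multi-lead cardiac cycle into a fixed-length word vector.

    cycle_leads: (n_leads, n_samples) array for a single cardiac cycle
    r_peak_idx:  sample index of the R-peak within the cycle
    word_len:    fixed output length (1000 samples in the text)

    RMS across leads reduces the cycle to a single channel; zero-padding
    then places the R-peak at the centre of the vector so that all word
    vectors are R-peak aligned (assumed alignment convention).
    """
    rms = np.sqrt(np.mean(np.asarray(cycle_leads, dtype=float) ** 2, axis=0))
    word = np.zeros(word_len)
    start = word_len // 2 - r_peak_idx  # shift so the R-peak lands at the centre
    src_lo = max(0, -start)             # clip samples falling before the window
    src_hi = min(len(rms), word_len - start)  # clip samples falling after it
    word[start + src_lo:start + src_hi] = rms[src_lo:src_hi]
    return word

def build_sentence(words, max_len=50, word_len=1000):
    """Stack word vectors into a (max_len, word_len) sentence: zero-pad
    short recordings at the end, keep only the first max_len cycles of
    long ones."""
    sentence = np.zeros((max_len, word_len))
    for i, w in enumerate(words[:max_len]):
        sentence[i] = w
    return sentence
```

For a two-lead cycle `[[3, 0, 4], [4, 0, 3]]` with the R-peak at index 1, the RMS sequence is `[√12.5, 0, √12.5]` and the R-peak sample is placed at index 500 of the 1000-sample word vector.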
The model is pre-trained in an unsupervised manner on cardiac cycle prediction. For this task, the final layer of the Transformer architecture is replaced by a linear regression layer. Pre-training is performed on multiple available ECG datasets other than the CinC2021 training set. For supervised training, the CinC2021 training set is split into five equal folds stratified by cardiac arrhythmia labels and normalized by the standard deviation. The network is trained for 50 epochs using the Adam optimizer with a learning rate scheduler.
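The stratified five-fold split and standard-deviation normalization could be realized as below. This is a simplified sketch (the helper names are hypothetical): the challenge data is multi-label, whereas here each recording is stratified on a single label for clarity.

```python
import numpy as np

def stratified_folds(labels, n_folds=5, seed=0):
    """Assign each recording to one of n_folds folds so that every label
    class is spread (near-)evenly across folds, mirroring the stratified
    five-fold split described in the text (single-label simplification)."""
    rng = np.random.default_rng(seed)
    labels = np.asarray(labels)
    fold_of = np.empty(len(labels), dtype=int)
    for cls in np.unique(labels):
        idx = np.flatnonzero(labels == cls)
        rng.shuffle(idx)
        # deal the shuffled class members round-robin into the folds
        for k, i in enumerate(idx):
            fold_of[i] = k % n_folds
    return fold_of

def normalize(x):
    """Normalize a recording by its standard deviation."""
    x = np.asarray(x, dtype=float)
    return x / x.std()
```

With 10 recordings of one class and 5 of another, round-robin dealing puts exactly two of the first class and one of the second into each of the five folds.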
The cross-validated CinC2021 score of the unofficial phase model is 0.42.