In this work, we built an attention-based interpretable model to classify 12-lead ECGs for the PhysioNet/Computing in Cardiology Challenge 2020. We first use the discrete-time fast Fourier transform to extract important characteristics of the given high-frequency signals. Moreover, since information about different classification outcomes may be present only in specific segments, we tune our feature representation to capture how the frequency distribution shifts over time. This is done by first representing the original signal as a spectrogram, which shows the frequency spectrum of the signal over successive time windows. The spectrogram is then fed into an LSTM network, whose outputs at each step serve as attention vectors. These attention vectors are multiplied with the original signal window embeddings, the results are summed, and the sum is passed through a final linear layer that produces the output.
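The attention pooling described above can be sketched with plain NumPy. This is a simplified illustration, not the paper's implementation: the dimensions, the random stand-ins for learned quantities, and the use of a single softmax-normalized scalar weight per window (rather than a full attention vector) are all assumptions made for brevity.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy dimensions (hypothetical): 10 time windows, 32-dim window embeddings,
# 27 diagnosis classes as in the challenge label set.
T, D, C = 10, 32, 27

# Stand-ins for learned quantities: per-window embeddings derived from the
# spectrogram, and the LSTM outputs used as attention scores (one scalar
# score per window in this simplified sketch).
window_emb = rng.normal(size=(T, D))   # embeddings of the signal windows
lstm_scores = rng.normal(size=T)       # LSTM output scores, one per window

# Softmax turns the scores into attention weights that sum to 1.
attn = np.exp(lstm_scores) / np.exp(lstm_scores).sum()

# Attention-weighted sum of window embeddings -> fixed-size representation.
context = attn @ window_emb            # shape (D,)

# Final linear layer mapping the pooled representation to class logits.
W, b = rng.normal(size=(C, D)), np.zeros(C)
logits = W @ context + b               # shape (C,)
```

Because the pooled representation is a convex combination of the window embeddings, the learned weights `attn` directly expose how much each time window influenced the prediction.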
The advantage of such an architecture is that it lets us use state-of-the-art sequence models while also being able to quantify the contribution of each time window to the model's final output. It thus offers a way to interpret which windows contributed positively or negatively to a prediction, providing a tool that can highlight the relevant information for healthcare professionals to validate further.
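One way to read off these per-window contributions: since the class score is a linear function of the attention-weighted sum, it decomposes additively over windows. The sketch below assumes the same simplified scalar-weight setup as before; all arrays are random stand-ins for fitted quantities.

```python
import numpy as np

rng = np.random.default_rng(1)

# Hypothetical fitted quantities for one record: attention weights over
# 10 windows, their 32-dim embeddings, and one row of the final linear layer.
T, D = 10, 32
attn = np.abs(rng.normal(size=T))
attn /= attn.sum()                       # attention weights sum to 1
window_emb = rng.normal(size=(T, D))
w_class = rng.normal(size=D)             # linear weights for one class

# Each window's signed contribution to this class's score.
contrib = attn * (window_emb @ w_class)  # shape (T,)

# The contributions sum exactly to the class score, so positive entries
# pushed the prediction toward the class and negative entries away from it.
logit = contrib.sum()
top_window = int(np.argmax(np.abs(contrib)))  # window to highlight
```

Highlighting `top_window` (or all windows with large positive `contrib`) on the original ECG trace is what makes the relevant segments visible to a clinician.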
Our current approach achieves a geometric mean score of 0.596 on the leaderboard of the unofficial phase of the challenge. In future work, we intend to develop our methods further by continuing to leverage both the time and frequency domains. The next step is to compute our spectrograms over heartbeat windows instead of fixed time windows. We will also explore changes to the attention mechanism architecture that prove more beneficial for this problem.