Motion estimation is classically achieved by pairwise intensity-based registration methods, which optimise a similarity metric. However, engineering a metric that is sufficiently robust for cardiac applications has proved challenging because of the inherent spatial and temporal variability. Recently, convolutional neural networks (CNNs) have shown the potential to predict spatial correspondence between a given pair of images without engineered metrics, but voxel-level ground truth for learning such correspondence is scarce.
The aim of this work is to develop a deep learning framework (Fig. A) that uses synthesised training data to provide a fast solution for cardiac motion estimation. First, to synthesise left ventricular cavity labels and augment the manually labelled data, a 3D generative adversarial network (GAN) was trained on the end-diastole and end-systole phases of 30 cine cardiac MRI datasets. Second, a CNN was designed to learn the spatially and temporally dense displacement field (DDF) needed to predict voxel correspondence. Results were evaluated by deforming the source segmentation at end diastole into the target at end systole; the deformed segmentation was then compared with the end-systole ground truth using the Dice score.
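As an illustrative sketch only (not the authors' implementation), the evaluation step described above, warping the end-diastole segmentation with a predicted DDF and scoring the result against the end-systole ground truth with Dice, might look like the following; the function names and the backward-warping convention for the DDF are assumptions:

```python
import numpy as np
from scipy.ndimage import map_coordinates

def warp_segmentation(seg_src, ddf):
    """Deform a binary segmentation with a dense displacement field.

    seg_src : (D, H, W) binary array, the source (end-diastole) segmentation.
    ddf     : (3, D, H, W) displacement field in voxels, assumed to map each
              target coordinate back to a source coordinate (backward warp).
    """
    # Identity sampling grid, one coordinate array per spatial axis.
    grid = np.stack(np.meshgrid(*[np.arange(s) for s in seg_src.shape],
                                indexing="ij"))
    coords = grid + ddf  # sample the source at the displaced locations
    # Nearest-neighbour interpolation (order=0) keeps labels binary.
    return map_coordinates(seg_src.astype(np.float32), coords, order=0)

def dice(a, b, eps=1e-8):
    """Dice overlap between two binary masks."""
    a, b = a.astype(bool), b.astype(bool)
    return 2.0 * np.logical_and(a, b).sum() / (a.sum() + b.sum() + eps)
```

A zero DDF leaves the segmentation unchanged (Dice of 1 against itself), which is a convenient sanity check for the warping convention.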
The classical intensity-based registration method with optimal hyperparameters achieved a Dice score of 0.79±0.16 across all 30 cases (~5 min per case). The CNN-based method achieved a Dice score of 0.82±0.09 (~1 min per case), and the GAN-based augmentation further improved this to 0.85±0.14.
Trained on sparse annotations, our networks provide real-time motion estimation, useful for characterising mechanical activation patterns.