Orientation-Robust Latent Motion Trajectory Learning for Annotation-free Cardiac Phase Detection in Fetal Echocardiography


Fetal echocardiography is essential for detecting congenital heart disease (CHD), facilitating pregnancy management, optimized delivery planning, and timely postnatal interventions. Among standard imaging planes, the four-chamber (4CH) view provides comprehensive information for CHD diagnosis, where clinicians carefully inspect the end-diastolic (ED) and end-systolic (ES) phases to evaluate cardiac structure and motion. Automated detection of these cardiac phases is thus a critical component toward fully automated CHD analysis. Yet, in the absence of fetal electrocardiography (ECG), manual identification of ED and ES frames remains a labor-intensive bottleneck. We present ORBIT (Orientation-Robust Beat Inference from Trajectories), a self-supervised framework that identifies cardiac phases without manual annotations under various fetal heart orientations. ORBIT employs registration as a self-supervision task and learns a latent motion trajectory of cardiac deformation, whose turning points capture transitions between cardiac relaxation and contraction, enabling accurate and orientation-robust localization of ED and ES frames across diverse fetal positions. Trained exclusively on normal fetal echocardiography videos, ORBIT achieves consistent performance on both normal (MAE = 1.9 frames for ED and 1.6 for ES) and CHD cases (MAE = 2.4 frames for ED and 2.1 for ES), outperforming existing annotation-free approaches constrained by fixed orientation assumptions. These results highlight the potential of ORBIT to facilitate robust cardiac phase detection directly from 4CH fetal echocardiography.


💡 Research Summary

The paper introduces ORBIT (Orientation‑Robust Beat Inference from Trajectories), a self‑supervised framework that automatically detects end‑diastolic (ED) and end‑systolic (ES) frames in fetal four‑chamber (4CH) echocardiography without any manual annotations. The motivation stems from the clinical importance of these phases for congenital heart disease (CHD) screening and the lack of fetal ECG, which makes manual identification labor‑intensive. Existing supervised approaches require large labeled datasets, while prior unsupervised methods either assume a fixed imaging orientation (typical for adult studies) or rely on reconstruction losses that struggle with the wide orientation variability inherent to fetal scans.

ORBIT tackles these challenges through two key innovations. First, it adopts a registration‑based self‑supervision strategy that introduces an abstract reference frame R. For each video frame Iₜ, an auto‑encoder predicts a stationary velocity field Vₜ←R that warps Iₜ toward R. By approximating the deformation between any two frames Iᵢ and Iⱼ as Φⱼ←ᵢ = exp(Vⱼ←R – Vᵢ←R), the model can be trained with a normalized cross‑correlation (NCC) loss without ever knowing the true reference. This eliminates the need for explicit ED/ES labels during training.
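The pairwise approximation and the NCC objective described above can be illustrated with a small sketch. This is not the authors' implementation: it works on a 1-D grid instead of 2-D images, uses a plain scaling-and-squaring scheme to evaluate the exponential, and computes a global (not windowed) NCC. The function names `interp`, `exp_field`, `pairwise_displacement`, `ncc`, and `ncc_loss` are hypothetical.

```python
import math

def interp(u, x):
    """Linearly interpolate the sampled field u at position x (clamped to the grid)."""
    x = min(max(x, 0.0), len(u) - 1.0)
    i = min(int(x), len(u) - 2)
    w = x - i
    return (1.0 - w) * u[i] + w * u[i + 1]

def exp_field(v, steps=6):
    """Scaling and squaring: turn a stationary velocity field into a displacement.

    Assumption: 1-D grid with unit spacing; the paper works on 2-D frames.
    """
    u = [val / (2 ** steps) for val in v]
    for _ in range(steps):
        # Compose the map x -> x + u(x) with itself.
        u = [u[k] + interp(u, k + u[k]) for k in range(len(u))]
    return u

def pairwise_displacement(v_j, v_i, steps=6):
    """Displacement of exp(V_j - V_i), with both fields defined relative to R."""
    diff = [a - b for a, b in zip(v_j, v_i)]
    return exp_field(diff, steps)

def ncc(a, b, eps=1e-8):
    """Global normalized cross-correlation of two equally long signals."""
    if len(a) != len(b):
        raise ValueError("inputs must have the same length")
    ma, mb = sum(a) / len(a), sum(b) / len(b)
    cov = sum((x - ma) * (y - mb) for x, y in zip(a, b))
    va = sum((x - ma) ** 2 for x in a)
    vb = sum((y - mb) ** 2 for y in b)
    return cov / (math.sqrt(va * vb) + eps)

def ncc_loss(warped, target):
    """Loss that is near 0 when the warped frame matches the target up to gain and offset."""
    return 1.0 - ncc(warped, target)
```

For a spatially constant velocity difference the exponential reduces to a pure translation by that constant, which makes a convenient sanity check. Because the reference R only enters through the difference of the two fields, it never has to be observed.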

Second, the model learns a low‑dimensional latent trajectory that captures the temporal dynamics of cardiac deformation. The encoder maps each frame to a D‑dimensional latent vector hₜ, which is decomposed into a static component zₛ (shared across the video) and a motion component zₘₜ. The motion component is constrained to an M‑dimensional subspace (M = 1 or 2) spanned by learnable orthogonal basis vectors. Two small multilayer perceptrons extract zₛ and the coordinates αₜ of zₘₜ in this subspace. The decoder reconstructs the velocity field from the combined latent code (zₛ + zₘₜ). After training, only the encoder is needed: a video is transformed into a sequence of αₜ values that form a 1‑D or 2‑D trajectory. Peaks and valleys in this trajectory correspond to transitions between contraction and relaxation, i.e., the ES and ED frames. The association of each turning point with a specific phase is determined empirically on a validation set.
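As a rough illustration of the inference step for the 1-D case (M = 1), the sketch below finds the peaks and valleys of a sequence of αₜ values. The summary does not specify how turning points are extracted, so the simple neighbour comparison here, and the name `turning_points`, are assumptions; a real pipeline would likely smooth the trajectory first.

```python
import math

def turning_points(alpha):
    """Return (peaks, valleys): frame indices of local maxima and minima.

    Assumption: a frame is a turning point if it is strictly above (or below)
    its left neighbour and not below (or above) its right neighbour.
    """
    peaks, valleys = [], []
    for t in range(1, len(alpha) - 1):
        if alpha[t - 1] < alpha[t] >= alpha[t + 1]:
            peaks.append(t)
        elif alpha[t - 1] > alpha[t] <= alpha[t + 1]:
            valleys.append(t)
    return peaks, valleys

# Synthetic trajectory: one cardiac cycle every 20 frames (illustrative only).
alpha = [math.sin(2 * math.pi * t / 20) for t in range(50)]
peaks, valleys = turning_points(alpha)
```

Which of the two lists corresponds to ED and which to ES is not fixed by the trajectory itself; as noted above, that assignment is made empirically on a validation set.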

The authors evaluated ORBIT on a dataset collected at John Radcliffe Hospital. The development data comprised 422 normal fetal 4CH videos, split 85 % / 15 % into training and validation sets. Testing involved 88 normal videos and 156 CHD videos, with frame rates ranging from 47 to 81 Hz. No ED/ES annotations were used during training or validation. ORBIT achieved mean absolute errors (MAE) of 1.9 frames for ED and 1.6 frames for ES on the normal test set, and 2.4 frames (ED) and 2.1 frames (ES) on the CHD set. These results surpass existing annotation‑free methods that assume a fixed orientation, demonstrating robustness to the diverse fetal heart orientations (apical, basal, transverse) present in the data. Moreover, the model’s performance remained stable across different preprocessing strategies (manual cropping versus tool‑based automatic cropping), indicating low sensitivity to preprocessing variations.
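To make the frame-level errors easier to interpret, the sketch below shows one plausible way to compute a frame MAE and convert it to milliseconds. The matching rule (each annotated frame is paired with its nearest predicted frame) is an assumption, since the summary does not describe the evaluation protocol, and the names `nearest_frame_mae` and `frames_to_ms` are hypothetical.

```python
def nearest_frame_mae(predicted, annotated):
    """Mean absolute error in frames, pairing each annotation with its nearest prediction.

    Assumption: nearest-neighbour matching; the paper's protocol may differ.
    """
    if not predicted or not annotated:
        raise ValueError("both frame lists must be non-empty")
    errors = [min(abs(p - a) for p in predicted) for a in annotated]
    return sum(errors) / len(errors)

def frames_to_ms(frames, fps):
    """Convert a duration in frames to milliseconds at a given frame rate."""
    return 1000.0 * frames / fps

# Toy example with made-up frame indices (not data from the paper).
mae = nearest_frame_mae(predicted=[5, 26, 44], annotated=[5, 25, 45])
```

Under this conversion, the reported ED error of 1.9 frames corresponds to roughly 40 ms at 47 Hz and roughly 23 ms at 81 Hz.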

In summary, ORBIT shows that (1) self‑supervised registration can provide a powerful learning signal for cardiac motion without any ground‑truth labels; (2) a very low‑dimensional latent trajectory (as few as one dimension) is sufficient to encode the periodic contraction‑relaxation cycle; and (3) this representation is inherently orientation‑robust, enabling accurate phase detection across normal and pathological fetal hearts. The work opens the door to fully automated pipelines for fetal cardiac assessment, where downstream tasks such as ventricular volume measurement, biometric analysis, and CHD classification can be built upon the automatically identified ED/ES frames. Future directions include extending the approach to multi‑view fetal cardiac imaging, real‑time deployment, and integration with downstream diagnostic models.

