ECGFlowCMR: Pretraining with ECG-Generated Cine CMR Helps Cardiac Disease Classification and Phenotype Prediction
Cardiac Magnetic Resonance (CMR) imaging provides a comprehensive assessment of cardiac structure and function but remains constrained by high acquisition costs and reliance on expert annotations, limiting the availability of large-scale labeled datasets. In contrast, electrocardiograms (ECGs) are inexpensive, widely accessible, and offer a promising modality for conditioning the generative synthesis of cine CMR. To this end, we propose ECGFlowCMR, a novel ECG-to-CMR generative framework that integrates a Phase-Aware Masked Autoencoder (PA-MAE) and an Anatomy-Motion Disentangled Flow (AMDF) to address two fundamental challenges: (1) the cross-modal temporal mismatch between multi-beat ECG recordings and single-cycle CMR sequences, and (2) the anatomical observability gap due to the limited structural information inherent in ECGs. Extensive experiments on the UK Biobank and a proprietary clinical dataset demonstrate that ECGFlowCMR can generate realistic cine CMR sequences from ECG inputs, enabling scalable pretraining and improving performance on downstream cardiac disease classification and phenotype prediction tasks.
💡 Research Summary
The paper introduces ECGFlowCMR, a novel cross‑modal generative framework that leverages inexpensive, widely available 12‑lead electrocardiograms (ECGs) to synthesize realistic cine cardiac magnetic resonance (CMR) videos, and then uses these synthetic videos for large‑scale pre‑training of downstream cardiac analysis models. Two fundamental challenges are addressed: (1) temporal mismatch—ECG recordings span multiple cardiac cycles while a cine CMR typically captures a single 50‑frame cardiac cycle; (2) anatomical observability gap—ECG provides only electrical activity, offering weak constraints on cardiac morphology.
To solve the temporal mismatch, the authors design a Phase‑Aware Masked Autoencoder (PA‑MAE). PA‑MAE follows the standard masked autoencoder paradigm for self‑supervised ECG representation learning, masking a random proportion of temporal positions and reconstructing the original signal. In addition, a dedicated phase‑prediction head outputs sinusoidal representations of cardiac phase (sin φ, cos φ). Ground‑truth phases are derived from R‑peak detection on Lead II and linearly interpolated across each RR interval. The phase loss forces the model to learn cycle‑level dynamics; predicted phases are then used to segment complete cardiac cycles and resample them via ROI‑Align to exactly 50 frames, achieving precise alignment with cine CMR.
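The phase-target construction described above can be sketched as follows. This is a minimal illustration, not the authors' code: the function names, a simple linear interpolation of phase across RR intervals, and a 1-D linear resampler standing in for the paper's ROI-Align-style resampling are all assumptions.

```python
import numpy as np

def phase_targets(r_peaks, n_samples):
    """Linearly interpolate cardiac phase phi in [0, 2*pi) across each RR
    interval (phi resets to 0 at every R-peak), then encode it as the
    (sin phi, cos phi) targets used by the phase-prediction head."""
    phi = np.zeros(n_samples)
    for start, end in zip(r_peaks[:-1], r_peaks[1:]):
        phi[start:end] = np.linspace(0.0, 2 * np.pi, end - start, endpoint=False)
    return np.stack([np.sin(phi), np.cos(phi)], axis=-1)  # shape (n_samples, 2)

def resample_cycle(features, start, end, n_frames=50):
    """Resample one detected cardiac cycle of an ECG feature sequence to a
    fixed number of frames by linear interpolation, aligning it with a
    50-frame cine CMR sequence."""
    src = np.linspace(start, end - 1, n_frames)  # fractional source positions
    lo = np.floor(src).astype(int)
    hi = np.minimum(lo + 1, end - 1)
    w = (src - lo)[:, None]                      # interpolation weights
    return (1 - w) * features[lo] + w * features[hi]
```

In practice the R-peaks would come from a detector run on Lead II, and `features` would be the encoder's per-timestep representations rather than the raw signal.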
The anatomical observability gap is tackled by the Anatomy‑Motion Disentangled Flow (AMDF) module. First, a 3‑D variational auto‑encoder (3D‑VAE) encodes CMR videos into a time‑invariant latent template that serves as a static anatomical anchor. This anchor captures the overall shape of the heart while discarding motion. Second, a diffusion‑transformer‑based flow‑matching network, conditioned on the ECG representation and the static template, predicts ECG‑conditioned velocity fields for each time step. Applying these velocity fields to the template yields temporally coherent deformations, producing a full cine sequence that respects both anatomical fidelity and realistic cardiac motion. Training jointly minimizes reconstruction loss for the VAE, a flow‑matching loss for the diffusion network, and the PA‑MAE losses, ensuring that the generated videos are both visually plausible and physiologically consistent.
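A single flow-matching training step of the kind described above can be sketched as below. This is a hedged stand-in, not the paper's implementation: `velocity_net`, the latent shapes, and the linear interpolation path between noise and the target latent are illustrative assumptions, with the ECG embedding and static template passed in as conditioning.

```python
import torch

def flow_matching_loss(velocity_net, z_template, z_video, ecg_emb):
    """One conditional flow-matching step: interpolate between Gaussian noise
    and the target video latent along a linear path, then regress the path's
    constant velocity with a network conditioned on the static anatomical
    template and the ECG embedding."""
    noise = torch.randn_like(z_video)
    t = torch.rand(z_video.shape[0], 1, 1)              # one time per sample
    z_t = (1 - t) * noise + t * z_video                  # point on the path
    target_v = z_video - noise                           # constant velocity
    pred_v = velocity_net(z_t, t, z_template, ecg_emb)   # conditioned prediction
    return torch.mean((pred_v - target_v) ** 2)
```

At sampling time, the learned velocity field would be integrated from noise toward a motion latent, which is then applied to the template to deform it into a full cine sequence; the joint objective additionally includes the VAE reconstruction and PA-MAE losses described above.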
Experiments are conducted on the UK Biobank (42,129 CMR scans) and a proprietary clinical cohort of 535 patients. Quantitative image quality metrics (PSNR ≈ 33 dB, SSIM ≈ 0.94, LPIPS ≈ 0.07) demonstrate that synthetic cine CMR is virtually indistinguishable from real scans, a finding corroborated by expert visual assessment. For downstream tasks, models pre‑trained on the synthetic dataset achieve substantial gains: cardiac disease classification AUROC improves from 0.87 to 0.92, and in phenotype regression the mean absolute error falls from 5.2 ml to 3.8 ml for LVEDV and from 4.1 % to 2.9 % for LVEF. Importantly, the same pre‑trained models retain high performance on the external clinical set, indicating robustness to distribution shifts.
Key contributions include: (1) a phase‑aware ECG encoder that aligns multi‑beat ECGs with single‑cycle cine CMR; (2) a disentangled anatomy‑motion latent space that supplies strong structural priors despite the weak anatomical signal in ECG; (3) demonstration that ECG‑conditioned synthetic CMR can serve as a scalable pre‑training resource, dramatically reducing dependence on costly, expert‑annotated CMR datasets.
Limitations are acknowledged: the current system assumes standard 12‑lead ECGs, so adaptation to single‑lead wearable recordings remains future work; the combined VAE and diffusion architecture is computationally intensive, motivating research into lighter models; and clinical deployment will require regulatory validation of synthetic data usage.
In summary, ECGFlowCMR provides a compelling solution to the data scarcity problem in cardiac imaging by converting ubiquitous ECG signals into high‑fidelity cine CMR, enabling massive self‑supervised pre‑training and delivering measurable improvements in disease classification and phenotype prediction. This work paves the way for cost‑effective, large‑scale AI pipelines in cardiovascular diagnostics.