EXAONE Path 2.5: Pathology Foundation Model with Multi-Omics Alignment
Cancer progression arises from interactions across multiple biological layers, many of which are molecular rather than morphological and therefore remain invisible to image-only models. To capture this broader biological landscape, we present EXAONE Path 2.5, a pathology foundation model that jointly models histologic, genomic, epigenetic, and transcriptomic modalities, producing an integrated patient representation that reflects tumor biology more comprehensively. Our approach incorporates three key components: (1) a multimodal SigLIP loss that enables all-pairwise contrastive learning across heterogeneous modalities, (2) a fragment-aware rotary positional encoding (F-RoPE) module that preserves spatial structure and tissue-fragment topology in whole-slide images (WSIs), and (3) domain-specialized internal foundation models for both WSI and RNA-seq that provide biologically grounded embeddings for robust multimodal alignment. We evaluate EXAONE Path 2.5 against six leading pathology foundation models across two complementary benchmarks: an internal real-world clinical dataset and the Patho-Bench benchmark covering 80 tasks. Our framework demonstrates high data and parameter efficiency, achieving on-par performance with state-of-the-art foundation models on Patho-Bench while exhibiting the highest adaptability in the internal clinical setting. These results highlight the value of biologically informed multimodal design and underscore the potential of integrated genotype-to-phenotype modeling for next-generation precision oncology.
💡 Research Summary
The paper “EXAONE Path 2.5: Pathology Foundation Model with Multi-Omics Alignment” introduces a foundation model designed to overcome the limitations of image-only computational pathology. Recognizing that cancer progression is driven by complex interactions across multiple biological layers that morphology alone cannot reveal, the authors propose a framework that jointly learns from five complementary modalities spanning four biological layers: histologic (whole-slide images, WSI), genomic (single-nucleotide polymorphisms, SNP, and copy-number variations, CNV), epigenetic (DNA methylation), and transcriptomic (bulk RNA-seq). The goal is to create an integrated patient representation that more comprehensively reflects tumor biology from genotype to phenotype.
The model’s architecture is built upon three key technical innovations. First, a Multimodal SigLIP Loss extends the sigmoid-based contrastive learning objective to handle multiple modalities. It treats every possible pair among the five modalities as an independent binary classification task, allowing the model to learn dedicated interaction spaces for each modality pair (e.g., how SNPs relate to WSI patterns) without the positive-pair competition inherent in standard softmax-based CLIP losses.
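To make the all-pairwise objective concrete, the following is a minimal sketch in plain Python, not the authors' implementation. It assumes each modality yields one embedding per patient in the same patient order, uses fixed scalar values for the temperature and bias (in SigLIP-style training these are typically learnable), and averages the per-pair losses with equal weight. The function names and the equal weighting are illustrative assumptions; the summary does not specify how pair losses are combined.

```python
import math
from itertools import combinations


def l2_normalize(vec):
    """Scale a vector to unit length (left unchanged if it is all zeros)."""
    norm = math.sqrt(sum(x * x for x in vec)) or 1.0
    return [x / norm for x in vec]


def log_sigmoid(x):
    """Numerically stable log(sigmoid(x))."""
    if x >= 0:
        return -math.log1p(math.exp(-x))
    return x - math.log1p(math.exp(x))


def pairwise_sigmoid_loss(a, b, temperature=10.0, bias=-10.0):
    """Sigmoid contrastive loss between two modalities.

    a[i] and b[i] are embeddings of the same patient. Every (i, j) cell is an
    independent binary decision: label +1 on the diagonal, -1 elsewhere.
    """
    a = [l2_normalize(v) for v in a]
    b = [l2_normalize(v) for v in b]
    n = len(a)
    total = 0.0
    for i in range(n):
        for j in range(n):
            similarity = sum(x * y for x, y in zip(a[i], b[j]))
            logit = temperature * similarity + bias
            label = 1.0 if i == j else -1.0
            total += log_sigmoid(label * logit)
    return -total / n


def multimodal_sigmoid_loss(embeddings, temperature=10.0, bias=-10.0):
    """Average the pairwise loss over every unordered pair of modalities.

    embeddings maps a modality name to a list of per-patient vectors.
    Equal weighting across pairs is an assumption made for illustration.
    """
    losses = {}
    for m1, m2 in combinations(sorted(embeddings), 2):
        losses[(m1, m2)] = pairwise_sigmoid_loss(
            embeddings[m1], embeddings[m2], temperature, bias
        )
    return sum(losses.values()) / len(losses), losses


if __name__ == "__main__":
    # Toy batch: two patients, five modalities, two-dimensional embeddings.
    names = ["wsi", "snp", "cnv", "methylation", "rna"]
    batch = {name: [[1.0, 0.0], [0.0, 1.0]] for name in names}
    mean_loss, per_pair = multimodal_sigmoid_loss(batch)
    print(len(per_pair), "modality pairs, mean loss", mean_loss)
```

With five modalities the loop visits ten modality pairs. Because each cell is scored by its own sigmoid rather than by a softmax over the row, a positive pair in one modality pairing does not compete with positives in another.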
Second, to address the loss of spatial context when processing gigapixel WSIs as unordered patch sets, the authors developed a Fragment-Aware Rotary Position Encoding (F-RoPE) module. This component preserves the spatial geometry within coherent tissue fragments (identified via segmentation) using Rotary Positional Embeddings (RoPE), while employing an attention mask to prevent spurious correlations between patches from physically disconnected fragments on the same slide. This ensures the model captures region-level histologic patterns that are often linked to molecular states.
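The two ingredients described above, rotary position encoding over patch coordinates and a mask that blocks attention across fragments, can be sketched as follows. This is an illustrative toy in plain Python under stated assumptions, not the paper's F-RoPE module: fragment IDs are assumed to come from an upstream tissue segmentation step, patch positions are integer (row, col) grid coordinates, the feature dimension is divisible by 4, and the first half of the features encodes the row while the second half encodes the column. All function names are hypothetical.

```python
import math


def fragment_attention_mask(fragment_ids):
    """mask[i][j] is True only when patches i and j lie in the same fragment."""
    return [[fi == fj for fj in fragment_ids] for fi in fragment_ids]


def rope_1d(vec, position, base=10000.0):
    """Rotate consecutive feature pairs by angles proportional to position."""
    dim = len(vec)
    out = []
    for k in range(0, dim, 2):
        theta = position * base ** (-k / dim)
        c, s = math.cos(theta), math.sin(theta)
        x, y = vec[k], vec[k + 1]
        out += [x * c - y * s, x * s + y * c]
    return out


def rope_2d(vec, row, col):
    """Encode the row in the first half of the features, the column in the second."""
    half = len(vec) // 2
    return rope_1d(vec[:half], row) + rope_1d(vec[half:], col)


def masked_attention_scores(queries, keys, coords, fragment_ids):
    """Scaled dot-product scores with 2D RoPE; cross-fragment pairs get -inf."""
    mask = fragment_attention_mask(fragment_ids)
    q = [rope_2d(v, r, c) for v, (r, c) in zip(queries, coords)]
    k = [rope_2d(v, r, c) for v, (r, c) in zip(keys, coords)]
    scale = 1.0 / math.sqrt(len(queries[0]))
    n = len(q)
    scores = []
    for i in range(n):
        row_scores = []
        for j in range(n):
            if mask[i][j]:
                dot = sum(a * b for a, b in zip(q[i], k[j]))
                row_scores.append(dot * scale)
            else:
                # A softmax over this row assigns zero weight to -inf entries.
                row_scores.append(float("-inf"))
        scores.append(row_scores)
    return scores
```

Because the rotations compose, the score between two patches in the same fragment depends on their coordinate offset rather than their absolute positions, which is what lets the encoding preserve local geometry. The mask then ensures that patches from disconnected fragments never attend to each other, regardless of how close their slide coordinates happen to be.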
Third, the model leverages internal, domain-specialized foundation models for both WSI and RNA-seq data. These are large models pretrained independently on massive histopathology and transcriptomics datasets, respectively. They provide robust, biologically grounded embeddings as frozen feature extractors, which significantly stabilizes and enhances the subsequent multimodal alignment training, leading to high data efficiency.
The model was evaluated extensively against six state-of-the-art pathology foundation models (TITAN, PRISM, CHIEF, GigaPath, UNI, H-optimus-0) on two complementary benchmarks: an internal, real-world multi-institutional clinical dataset and the public Patho-Bench suite covering 80 diverse tasks. The internal benchmarks focused on predicting clinically relevant biomarkers like Tumor Mutational Burden (TMB), EGFR/KRAS mutations in lung adenocarcinoma, and Microsatellite Instability (MSI) in colorectal cancer across hospitals in Korea and the US.
Results demonstrated that EXAONE Path 2.5 achieved the highest average AUROC (0.7675) on the internal clinical benchmarks, showing superior adaptability to real-world, heterogeneous data. On the comprehensive Patho-Bench, it performed on par with the leading models. Crucially, it achieved this competitive performance while being significantly more parameter-efficient (approximately 30M parameters) and trained on less data than several billion-parameter competitors. This highlights the value of a biologically informed, systematic multimodal design over mere scale expansion.
In conclusion, EXAONE Path 2.5 represents a significant step towards explainable, integrated genotype-to-phenotype modeling in computational pathology. Its success underscores the potential of principled architectural design that respects biological hierarchy and clinical variability for advancing next-generation precision oncology.