FALCON: Few-Shot Adversarial Learning for
Cross-Domain Medical Image Segmentation
Abdur R. Fayjie*1, Pankhi Kashyap2, Jutika Borah3, and Patrick
Vandewalle1
1KU Leuven, Leuven, 3000, Belgium.
2IIT-Bombay, Mumbai, 400076, India.
3Tezpur University, Tezpur, 784028, India.
ABSTRACT
Precise delineation of anatomical and pathological structures within 3D medical volumes is crucial for accurate diagnosis, effective surgical planning, and longitudinal disease monitoring. Despite advancements in AI, clinically viable segmentation is often hindered by the scarcity of 3D annotations, patient-specific variability, data privacy concerns, and substantial computational overhead. In this work, we propose FALCON, a cross-domain few-shot segmentation framework that achieves high-precision 3D volume segmentation by processing data as 2D slices. The framework is first meta-trained on natural images to learn generalizable segmentation priors, then transferred to the medical domain via adversarial fine-tuning and boundary-aware learning. Task-aware inference, conditioned on support cues, allows FALCON to adapt dynamically to patient-specific anatomical variations across slices. Experiments on four benchmarks demonstrate that FALCON consistently achieves the lowest Hausdorff Distance scores, indicating superior boundary accuracy, while maintaining a Dice Similarity Coefficient comparable to state-of-the-art models. Notably, these results are achieved with significantly less labeled data, no data augmentation, and substantially lower computational overhead.
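For reference, the two evaluation metrics named above can be computed from binary masks as follows. This is a minimal NumPy sketch (not the authors' evaluation code): the Dice Similarity Coefficient measures region overlap, while the Hausdorff Distance measures the worst-case gap between the two foreground point sets and is therefore sensitive to boundary errors.

```python
import numpy as np

def dice_coefficient(pred, gt):
    """Dice Similarity Coefficient (DSC) between two binary masks."""
    pred, gt = pred.astype(bool), gt.astype(bool)
    inter = np.logical_and(pred, gt).sum()
    denom = pred.sum() + gt.sum()
    return 2.0 * inter / denom if denom > 0 else 1.0

def hausdorff_distance(pred, gt):
    """Symmetric Hausdorff Distance (HD) between the foreground point
    sets of two binary masks, in pixel units (pure NumPy, O(n*m))."""
    p = np.argwhere(pred.astype(bool)).astype(float)
    g = np.argwhere(gt.astype(bool)).astype(float)
    d = np.linalg.norm(p[:, None, :] - g[None, :, :], axis=-1)  # pairwise distances
    return max(d.min(axis=1).max(), d.min(axis=0).max())

# Example: two overlapping 4x4 squares shifted by one pixel.
a = np.zeros((8, 8)); a[2:6, 2:6] = 1
b = np.zeros((8, 8)); b[3:7, 3:7] = 1
```

The two squares overlap in a 3x3 region, so DSC = 2·9/(16+16) = 0.5625, while the diagonal one-pixel shift gives HD = √2: two masks can thus have a moderate Dice score yet a small boundary error, which is why the paper reports both.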
1 INTRODUCTION
Accurate segmentation of anatomical structures, such as the liver, kidney, heart, and pathological
regions like brain tumors in MRI, is critical for diagnosis, treatment planning, and monitoring
disease progression, enabling clinicians to assess patient conditions comprehensively and make
informed decisions. This task is typically performed manually by radiologists or clinicians,
rendering it labor-intensive, time-consuming, and subject to variability. To improve efficiency
and consistency, automated segmentation methods based on AI have gained significant interest.
Artificial Intelligence (AI) with Deep Neural Networks (DNNs), particularly those employing transformer architectures, has shown remarkable progress in general image analysis. However, applying these models directly to medical imaging faces several challenges: they require substantial computational resources for both training and inference, and their training typically depends on access to large-scale annotations. Particularly for 3D volumes, the manual creation of masks by clinical experts is prohibitively expensive and time-consuming.
Generative models that create synthetic data offer a promising solution to data and annotation scarcity, yet their clinical adoption is hindered by the need for rigorous validation and regulatory compliance [U.S. Food & Drug Administration (FDA), 2021a,b]. Conventional data augmentation techniques, including rotations, scaling, and intensity adjustments, are widely used but may introduce unrealistic variations that fail to capture clinically relevant features accurately, potentially undermining model reliability in practice [Elgendi et al., 2021; Pattilachan et al., 2022; Madani et al., 2018; Tirindelli et al., 2021]. Furthermore, accurate boundaries are crucial in medical image segmentation, as small localization errors can have significant clinical consequences, such as inaccurate tumor measurements leading to severe surgical complications. Commonly used loss functions, including cross-entropy and Dice loss, treat all pixels uniformly and often do not sufficiently emphasize boundary regions, limiting segmentation accuracy at edges [Kervadec et al., 2021].
*Corresponding Author: fayjie92@gmail.com
arXiv:2601.01687v1 [cs.CV] 4 Jan 2026
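One common remedy for this uniform treatment of pixels, sketched below in plain NumPy (this is an illustrative boundary-weighting scheme, not the paper's exact loss), is to scale the per-pixel cross-entropy by a weight map that emphasizes the mask boundary, extracted here as a one-pixel morphological gradient (the mask minus its 4-neighborhood erosion).

```python
import numpy as np

def boundary_weight_map(mask, w_boundary=5.0):
    """Per-pixel weight map that up-weights mask-boundary pixels.
    The boundary is the mask minus its 4-neighborhood erosion, computed
    with NumPy shifts. Note: np.roll wraps at array edges, so masks are
    assumed not to touch the image border."""
    m = mask.astype(bool)
    eroded = m.copy()
    for shift, axis in ((1, 0), (-1, 0), (1, 1), (-1, 1)):
        eroded &= np.roll(m, shift, axis=axis)
    boundary = m & ~eroded
    return np.where(boundary, w_boundary, 1.0)

def weighted_bce(prob, mask, weights, eps=1e-7):
    """Binary cross-entropy scaled by a per-pixel weight map."""
    prob = np.clip(prob, eps, 1 - eps)
    ce = -(mask * np.log(prob) + (1 - mask) * np.log(1 - prob))
    return (weights * ce).mean()

mask = np.zeros((6, 6)); mask[2:4, 2:4] = 1
w = boundary_weight_map(mask)  # the whole 2x2 block is boundary
```

Boundary-localized terms such as the boundary loss of Kervadec et al. [2021] follow the same intuition: errors on or near edges should dominate the objective rather than being averaged away by the interior.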
Figure 1. Problem Formulation of Cross-Domain Few-Shot Segmentation (CDFSS). A model is trained on source tasks τs, involving base classes Cbase from a source dataset Ds ∼ D (e.g., natural images). The objective is to generalize to target tasks τt involving previously unseen classes Cnovel from a distinct target dataset Dt ∼ D′ (e.g., medical imaging). The underlying distributions of the source and target datasets are denoted by D and D′, respectively. This mimics human cognitive processes: medical trainees acquire broad foundational knowledge over time and later adapt it to specialize as clinicians. Unlike the label-rich source domain, the target domain is characterized by limited data and scarce annotations.
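The episodic structure of Figure 1 can be made concrete with a small sampler. The sketch below is a generic N-way K-shot episode builder under assumed conventions (the function name, the dict-based dataset layout, and the hyperparameter defaults are illustrative, not the authors' implementation): during meta-training, episodes draw from Cbase in the source domain; at test time, the same sampler draws from Cnovel in the target domain.

```python
import random

def sample_episode(dataset, classes, n_way=1, k_shot=5, q_queries=1, rng=random):
    """Sample one few-shot episode (task).
    `dataset` maps class name -> list of (image, mask) pairs;
    `classes` is C_base during meta-training or C_novel at test time.
    Returns disjoint support and query sets of (image, mask, class) triples."""
    episode_classes = rng.sample(classes, n_way)
    support, query = [], []
    for cls in episode_classes:
        # Draw k_shot + q_queries distinct samples, then split them.
        samples = rng.sample(dataset[cls], k_shot + q_queries)
        support += [(img, msk, cls) for img, msk in samples[:k_shot]]
        query += [(img, msk, cls) for img, msk in samples[k_shot:]]
    return support, query
```

Because only the class pool changes between meta-training and deployment, the model never sees target-domain labels beyond the few support annotations of each test episode.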
Driven by the need for privacy-preserving and resource-efficient medical AI, this paper proposes that unlabeled slices from the 3D volume of a single patient can provide the necessary context for high-accuracy segmentation. We hypothesize that a task-aware inference mechanism enables the model to adapt dynamically to patient-specific anatomical variations across slices.
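The slice-wise treatment of 3D volumes described in the abstract reduces to a simple loop: run a 2D segmenter over each axial slice and re-stack the predictions. The sketch below illustrates only this decomposition (the callable `segment_slice` is a hypothetical stand-in for the 2D model, and the support-cue conditioning that FALCON performs per task is omitted).

```python
import numpy as np

def segment_volume(volume, segment_slice):
    """Segment a 3D volume (Z, H, W) slice-by-slice with a 2D model.
    `segment_slice` maps a 2D slice (H, W) to a 2D mask (H, W);
    the per-slice masks are re-stacked into a 3D mask volume."""
    masks = [segment_slice(volume[z]) for z in range(volume.shape[0])]
    return np.stack(masks, axis=0)
```

This decomposition is what keeps memory and compute costs low relative to full 3D networks: each forward pass touches a single slice, and the volume dimension never enters the model.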