KLO-Net: A Dynamic K-NN Attention U-Net with CSP Encoder for Efficient Prostate Gland Segmentation from MRI

Notice: This research summary and analysis were automatically generated using AI technology. For absolute accuracy, please refer to the [Original Paper Viewer] below or the Original ArXiv Source.

Real-time deployment of prostate MRI segmentation on clinical workstations is often bottlenecked by computational load and memory footprint. Deep learning-based prostate gland segmentation also remains challenging because of anatomical variability. To bridge this efficiency gap while maintaining reliable segmentation accuracy, we propose KLO-Net, a dynamic K-Nearest Neighbor (K-NN) attention U-Net with a Cross Stage Partial (CSP) encoder for efficient prostate gland segmentation from MRI scans. Unlike the regular K-NN attention mechanism, the proposed dynamic K-NN attention mechanism allows the model to adaptively determine the number of attention connections for each spatial location within a slice. In addition, the CSP blocks reduce computational load and memory consumption. To validate the proposed architecture, comprehensive experiments and ablation studies are conducted on two public datasets, PROMISE12 and PROSTATEx. The detailed comparative analysis demonstrates the model’s advantage in computational efficiency and segmentation quality.


💡 Research Summary

This paper introduces KLO-Net, a novel deep learning architecture designed for efficient and accurate prostate gland segmentation from MRI scans, addressing the critical need for real-time deployment in clinical settings. The model is based on a U-Net-style encoder-decoder framework but incorporates two key innovations to bridge the gap between high accuracy and computational efficiency.

The first innovation is the integration of Cross Stage Partial (CSP) blocks into the encoder path. Replacing the standard double convolution blocks of U-Net, CSP blocks split the feature maps into two paths: one undergoes a series of bottleneck transformations, while the other maintains a lightweight identity mapping. These paths are later merged. This design reduces computational redundancy and memory footprint while preserving gradient flow, leading to a more parameter-efficient encoder without sacrificing representational capacity.
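
The split, transform, and merge pattern described above can be sketched in a few lines of NumPy. This is a minimal illustration and not the authors' implementation: it assumes an even channel split, uses pointwise (1x1) convolutions in place of the spatial convolutions a real encoder would use, and the names `csp_block`, `w_reduce`, `w_expand`, and `w_merge` are hypothetical.

```python
import numpy as np

def conv1x1(x, weight):
    """Pointwise convolution: x is (C_in, H, W), weight is (C_out, C_in)."""
    return np.einsum("oc,chw->ohw", weight, x)

def relu(x):
    return np.maximum(x, 0.0)

def csp_block(x, w_reduce, w_expand, w_merge):
    """Illustrative CSP-style block on a (C, H, W) feature map (assumed layout)."""
    c = x.shape[0]
    # Split the channels into two paths (an even split is an assumption).
    part_a, part_b = x[: c // 2], x[c // 2:]
    # Path A: bottleneck transformation (reduce, then expand) with a residual add.
    hidden = relu(conv1x1(part_a, w_reduce))
    part_a = part_a + relu(conv1x1(hidden, w_expand))
    # Path B: lightweight identity mapping, so no computation is spent on it.
    merged = np.concatenate([part_a, part_b], axis=0)
    # Transition layer that mixes the two paths after merging.
    return relu(conv1x1(merged, w_merge))

rng = np.random.default_rng(0)
C, H, W = 8, 16, 16
x = rng.standard_normal((C, H, W))
w_reduce = rng.standard_normal((2, C // 2)) * 0.1   # 4 -> 2 channels
w_expand = rng.standard_normal((C // 2, 2)) * 0.1   # 2 -> 4 channels
w_merge = rng.standard_normal((C, C)) * 0.1         # 8 -> 8 channels
y = csp_block(x, w_reduce, w_expand, w_merge)
print(y.shape)
```

Only half of the channels pass through the bottleneck, which is where the savings in computation and memory come from in this sketch.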

The second and core innovation is the proposed Dynamic K-Nearest Neighbor (K-NN) Attention mechanism. Traditional self-attention is computationally prohibitive for high-resolution medical images. While standard K-NN attention reduces cost by having each query attend only to its top-k most similar keys, it uses a fixed ‘k’ value for all spatial positions. KLO-Net’s dynamic variant introduces a small gating network that predicts a position-specific attention density parameter (τ) from the input features. This parameter dynamically determines the number of neighbors (k) each query should attend to, allowing the model to allocate dense attention to complex regions like organ boundaries and sparse attention to homogeneous background areas. This adaptive sparsity maintains the ability to capture long-range contextual dependencies crucial for segmenting anatomically variable structures like the prostate, but at a substantially lower computational cost than full self-attention. This module is strategically placed in the deeper encoder stages and the bottleneck where feature maps are semantically rich but spatially compact.
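
A rough NumPy sketch of the idea follows. The summary does not give the exact gating network or the rule that turns τ into k, so several parts here are assumptions made for illustration: a single linear layer with a sigmoid as the gate, the mapping k = ceil(τ · k_max), and the names `dynamic_knn_attention`, `w_gate`, and `k_max`. The sketch also builds the full N×N score matrix for readability, so it shows the selection logic and not the computational savings.

```python
import numpy as np

def dynamic_knn_attention(feats, w_q, w_k, w_v, w_gate, k_max):
    """feats: (N, d) features for N spatial positions; returns attended features."""
    n, d = feats.shape
    q, k, v = feats @ w_q, feats @ w_k, feats @ w_v
    scores = q @ k.T / np.sqrt(d)                       # (N, N) query-key similarity
    # Gate (assumed form): one density value tau in (0, 1) per position.
    tau = 1.0 / (1.0 + np.exp(-(feats @ w_gate)))       # (N,)
    # Assumed mapping from tau to an integer neighbour count per position.
    k_per_query = np.clip(np.ceil(tau * k_max), 1, k_max).astype(int)
    # Rank the keys of every query (0 = most similar) and keep the top k_i.
    order = np.argsort(-scores, axis=1)
    ranks = np.empty_like(order)
    np.put_along_axis(ranks, order, np.arange(n)[None, :], axis=1)
    keep = ranks < k_per_query[:, None]
    masked = np.where(keep, scores, -np.inf)
    # Softmax restricted to the kept neighbours of each query.
    masked = masked - masked.max(axis=1, keepdims=True)
    weights = np.exp(masked)
    weights /= weights.sum(axis=1, keepdims=True)
    return weights @ v, k_per_query, weights

rng = np.random.default_rng(0)
N, d = 64, 16                                           # e.g. an 8x8 feature map
feats = rng.standard_normal((N, d))
w_q, w_k, w_v = (rng.standard_normal((d, d)) / np.sqrt(d) for _ in range(3))
w_gate = rng.standard_normal(d)
out, k_per_query, weights = dynamic_knn_attention(feats, w_q, w_k, w_v, w_gate, k_max=16)
print(out.shape, k_per_query.min(), k_per_query.max())
```

Setting every entry of `k_per_query` to the same constant recovers standard fixed-k K-NN attention, which makes the difference between the two variants easy to see.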

The authors conduct comprehensive experiments on two public MRI datasets: PROMISE12 and PROSTATEx. Ablation studies on PROMISE12 quantify the contribution of each component: adding CSP alone reduced the parameter count, while adding dynamic K-NN attention alone significantly improved the Dice Similarity Coefficient (DSC). The full KLO-Net model, combining both CSP and dynamic K-NN attention, achieved the best balance: it attained the highest DSC (0.8555) and the lowest 95% Hausdorff Distance (HD95: 6.4355), while also maintaining a lower parameter count than the baseline U-Net.
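
For readers unfamiliar with the two metrics, the sketch below shows one common way to compute them on binary 2D masks. It is not the paper's evaluation code. In particular, this HD95 variant measures distances between all foreground pixels in pixel units, whereas many toolkits use surface points and physical voxel spacing, so its values are not directly comparable to the reported numbers. Both functions assume non-empty masks.

```python
import numpy as np

def dice_coefficient(pred, target, eps=1e-8):
    """DSC = 2|A ∩ B| / (|A| + |B|) for two binary masks."""
    pred, target = pred.astype(bool), target.astype(bool)
    intersection = np.logical_and(pred, target).sum()
    return (2.0 * intersection + eps) / (pred.sum() + target.sum() + eps)

def hd95(pred, target):
    """95th-percentile symmetric Hausdorff distance between foreground pixel sets."""
    a = np.argwhere(pred.astype(bool)).astype(float)
    b = np.argwhere(target.astype(bool)).astype(float)
    # Pairwise Euclidean distances between every point in a and every point in b.
    dists = np.linalg.norm(a[:, None, :] - b[None, :, :], axis=-1)
    a_to_b = dists.min(axis=1)   # nearest-neighbour distance for each point in a
    b_to_a = dists.min(axis=0)   # nearest-neighbour distance for each point in b
    return max(np.percentile(a_to_b, 95), np.percentile(b_to_a, 95))

# Toy example: two 8x8 squares, one shifted down by two rows.
pred = np.zeros((16, 16), dtype=bool)
target = np.zeros((16, 16), dtype=bool)
pred[4:12, 4:12] = True
target[6:14, 4:12] = True
dsc = dice_coefficient(pred, target)
hd = hd95(pred, target)
print(round(float(dsc), 4), float(hd))   # 0.75 2.0
```

Higher DSC means better overlap, while lower HD95 means the predicted boundary stays closer to the reference, which is why the summary reports the highest DSC together with the lowest HD95.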

Comparative analysis on the PROSTATEx dataset against five state-of-the-art methods (including Attention U-Net, TransUNet, and nnU-Net) showed that KLO-Net achieves competitive or superior segmentation accuracy with significantly lower computational complexity, measured in FLOPs and number of parameters. This supports KLO-Net’s primary advantage: reliable segmentation quality from a much more efficient architecture, which makes real-time deployment on clinical workstations more feasible.
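
To make the two complexity measures concrete, the sketch below counts parameters and FLOPs for a plain 2D convolution. The layer sizes are illustrative and are not taken from the paper, and the helper names are hypothetical. FLOP conventions also differ between tools: here one multiply-accumulate counts as two FLOPs, with stride 1, "same" padding, and the bias ignored.

```python
def conv2d_params(c_in, c_out, kernel=3, bias=True):
    """Learnable parameters of one 2D convolution layer."""
    return kernel * kernel * c_in * c_out + (c_out if bias else 0)

def conv2d_flops(c_in, c_out, height, width, kernel=3):
    """Approximate FLOPs for one forward pass over a height x width output map."""
    return 2 * height * width * kernel * kernel * c_in * c_out

# A U-Net-style double convolution block, 64 -> 128 -> 128 channels (illustrative sizes).
double_conv = conv2d_params(64, 128) + conv2d_params(128, 128)
print(double_conv)                                   # 221440

# Halving both input and output channels cuts parameters and FLOPs by a factor of 4.
full = conv2d_params(64, 128, bias=False)
half = conv2d_params(32, 64, bias=False)
print(full // half)                                  # 4
print(conv2d_flops(64, 128, 64, 64) // conv2d_flops(32, 64, 64, 64))   # 4
```

The quadratic dependence on channel width is one reason why routing only part of the channels through the heavy convolutions, as a CSP-style split does, can lower both counts.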

In summary, KLO-Net successfully integrates the adaptive context modeling of dynamic sparse attention with the parameter efficiency of CSP network design. It presents a significant step towards practical, efficient, and accurate deep learning solutions for medical image segmentation, specifically for the challenging task of prostate gland delineation in MRI.

