MTS-CSNet: Multiscale Tensor Factorization for Deep Compressive Sensing on RGB Images


Deep-learning-based compressive sensing (CS) methods typically learn sampling operators using convolutional or block-wise fully connected layers, which limit receptive fields and scale poorly for high-dimensional data. We propose MTS-CSNet, a CS framework based on Multiscale Tensor Summation (MTS) factorization, a structured operator for efficient multidimensional signal processing. MTS performs mode-wise linear transformations with multiscale summation, enabling large receptive fields and effective modeling of cross-dimensional correlations. In MTS-CSNet, MTS is first used as a learnable CS operator that performs linear dimensionality reduction in tensor space, with its adjoint defining the initial back-projection, and is then applied in the reconstruction stage to directly refine this estimate. This yields a simple feed-forward architecture without iterative or proximal optimization, while remaining parameter- and computation-efficient. Experiments on standard CS benchmarks show that MTS-CSNet achieves state-of-the-art reconstruction performance on RGB images, with notable PSNR gains and faster inference, even compared to recent diffusion-based CS methods, while using a significantly more compact feed-forward architecture.


💡 Research Summary

The paper introduces MTS‑CSNet, a novel compressive sensing (CS) framework for RGB images that leverages Multiscale Tensor Summation (MTS) as a learnable sensing operator and as a structured back‑projection mechanism. Traditional deep CS methods either use convolutional or block‑wise fully‑connected layers to implement the measurement matrix, which restricts the receptive field and scales poorly with image size. Recent approaches such as deep unfolding or diffusion‑based CS improve reconstruction quality but incur heavy iterative computation.

MTS is a structured linear transformation that operates directly on tensors. For each predefined spatial scale (window size w_sc) it extracts patches, applies mode-wise linear projections A^{(t,sc)}_k along each tensor mode, and then reassembles the transformed patches to the original layout before summing across scales. Mathematically, the operation can be written as

Y = Σ_{sc=1}^{S_C} f^{-1}_{w_sc} ( Σ_{t=1}^{T} f_{w_sc}(X) ×₁ A^{(t,sc)}_1 ×₂ … ×_J A^{(t,sc)}_J ),

where f_{w_sc} and f^{-1}_{w_sc} are the patch-embedding and inverse-embedding operators for window size w_sc. This formulation yields a sum of block-diagonal operators, providing a global receptive field while keeping the number of parameters linear in the product of mode dimensions.
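The patch-embed / mode-wise-project / inverse-embed pipeline above can be sketched in NumPy. This is a minimal illustration, not the authors' implementation: it assumes a 2-D single-channel input, square non-overlapping windows, J = 2 modes (patch rows and columns), and random projection matrices standing in for the learned A^{(t,sc)}_k; the name `mts_forward` is illustrative.

```python
# Minimal sketch of a single MTS-style forward pass (illustrative, not the paper's code).
import numpy as np

def mts_forward(X, scales, T, rng):
    """Sum over scales of: patch-embed -> mode-wise projections -> inverse embed."""
    H, W = X.shape
    Y = np.zeros_like(X)
    for w in scales:                      # window size w_sc for this scale
        # f_{w_sc}: split X into non-overlapping w x w patches -> (nh, nw, w, w)
        P = X.reshape(H // w, w, W // w, w).transpose(0, 2, 1, 3)
        acc = np.zeros_like(P)
        for t in range(T):                # T summed components per scale
            A1 = rng.standard_normal((w, w)) / np.sqrt(w)  # mode-1 projection
            A2 = rng.standard_normal((w, w)) / np.sqrt(w)  # mode-2 projection
            # mode-wise products: A1 acts on patch rows, A2 on patch columns
            acc += np.einsum('ij,nmjk,lk->nmil', A1, P, A2)
        # f^{-1}_{w_sc}: reassemble patches to the original layout, sum over scales
        Y += acc.transpose(0, 2, 1, 3).reshape(H, W)
    return Y
```

Because each branch is a patch-wise linear map, the whole operator is a sum of block-diagonal matrices, which is what gives the multiscale summation its large effective receptive field at low parameter cost.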

In MTS‑CSNet the MTS layer first acts as a compressive sensing operator. By selecting the output dimensions of the projection matrices according to a target compression ratio (CR), the network learns a compact measurement tensor Y. The number of summed components per scale, T, controls the richness of the representation; the authors experiment with T = 12, 24, 48.
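One way to pick the per-mode output sizes for a target compression ratio is to shrink every mode by the same factor, so the product of output dimensions lands near CR times the input size. The helper below is a hypothetical sizing rule for illustration; the paper's actual dimension selection is not specified here.

```python
# Hypothetical helper for sizing an MTS sampling output to a target
# compression ratio (CR); the uniform per-mode rule is an assumption.
def sampled_dims(in_dims, cr):
    """Shrink each mode by cr**(1/J) so prod(out) ~= cr * prod(in)."""
    per_mode = cr ** (1.0 / len(in_dims))
    return [max(1, round(d * per_mode)) for d in in_dims]

# e.g. a 32 x 32 x 3 patch at CR = 0.10
print(sampled_dims([32, 32, 3], 0.10))  # → [15, 15, 1]
```

Integer rounding and the small channel mode make the realized ratio only approximate (here 225/3072 ≈ 7.3%), so in practice one mode would be adjusted to hit the target CR exactly.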

For reconstruction, the adjoint of the MTS operator is used as a back‑projection. Unlike conventional CS pipelines that employ a purely linear adjoint, the authors insert a lightweight non‑linearity called MHG (Multiscale Hybrid Gating) before the adjoint mapping. This “proxy reconstruction” captures coarse global structure while preserving cross‑channel correlations.
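The proxy-reconstruction step can be sketched for a flattened measurement: gate the measurements with a nonlinearity, then apply the adjoint (transpose) of the sampling map. The SiLU-style gate below is a generic placeholder, since the MHG module's internals are not detailed in this summary, and `proxy_reconstruction` is an illustrative name.

```python
# Sketch of proxy reconstruction: nonlinearity on measurements, then adjoint.
# The gating is a SiLU placeholder standing in for the paper's MHG module.
import numpy as np

def proxy_reconstruction(y, A):
    gated = y / (1.0 + np.exp(-y))  # placeholder gating (SiLU), not MHG itself
    return A.T @ gated              # adjoint back-projection A^T

rng = np.random.default_rng(0)
A = rng.standard_normal((64, 256)) / 16   # 4x compression of a 256-dim signal
x = rng.standard_normal(256)
x0 = proxy_reconstruction(A @ x, A)       # coarse initial estimate of x
```

In MTS-CSNet the sampling map is the structured MTS operator rather than a dense matrix, so its adjoint is applied mode-wise with the transposed projection matrices, but the overall measure-gate-backproject flow is the same.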

A lightweight refinement network, MTSNet, follows the proxy step. MTSNet consists of N_B = 3 blocks, each containing four stacked MTS layers with scales

