A Convolutional Neural Networks Denoising Approach for Salt and Pepper Noise


Salt-and-pepper noise, especially at extremely high impulse densities, poses a significant challenge for image denoising. In this paper, we propose a non-local switching filter convolutional neural network denoising algorithm, named NLSF-CNN, for salt-and-pepper noise. As its name suggests, NLSF-CNN consists of two steps: an NLSF pre-processing step and a CNN training step. First, we develop an NLSF pre-processing step that cleans noisy images using non-local information. Then, the pre-processed images are divided into patches and used for CNN training, yielding a CNN denoising model for future noisy images. We conduct a number of experiments to evaluate the effectiveness of NLSF-CNN. Experimental results show that NLSF-CNN outperforms state-of-the-art denoising algorithms while requiring only a few training images.


💡 Research Summary

The paper addresses the challenging problem of denoising images corrupted by salt‑and‑pepper noise, especially at very high impulse densities (up to 70%). Traditional impulse‑noise filters such as median‑based methods or non‑local means rely heavily on local information and quickly lose effectiveness when the noise density is high because the local neighborhoods become dominated by corrupted pixels. To overcome this limitation, the authors propose a two‑stage framework called NLSF‑CNN (Non‑Local Switching Filter – Convolutional Neural Network).
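The corruption model itself is simple to reproduce: a fraction d of pixels is forced to one of the extreme gray levels, 0 ("pepper") or 255 ("salt"). A minimal sketch of such a noise generator, assuming an equal salt/pepper split (the function name and the 50/50 split are illustrative assumptions, not taken from the paper):

```python
import numpy as np

def add_salt_pepper(img, density, rng=None):
    """Corrupt a grayscale uint8 image with salt-and-pepper noise.

    A fraction `density` of pixels is set to 0 or 255; corrupted pixels
    are split evenly between salt (255) and pepper (0). Illustrative
    sketch, not the paper's exact noise generator.
    """
    rng = np.random.default_rng(rng)
    noisy = img.copy()
    mask = rng.random(img.shape) < density   # which pixels to corrupt
    salt = rng.random(img.shape) < 0.5       # half salt, half pepper
    noisy[mask & salt] = 255
    noisy[mask & ~salt] = 0
    return noisy
```

At 70% density, roughly seven out of ten pixels carry no usable information, which is why purely local filters break down.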

In the first stage, a Non‑Local Switching Filter (NLSF) is applied to the noisy image. The filter first detects noisy pixels by a simple threshold (δ = 1) that identifies extreme gray‑level values (0 or 255). For each detected pixel, a set of R surrounding patches of size L×L is collected from a non‑local region (i.e., not limited to the immediate neighborhood). The noisy pixel is then replaced by a weighted sum of the medians of these patches. The weight for each patch is computed from the Euclidean distance between the patch and the central patch, using a Gaussian kernel with standard deviation σ. During distance computation, any noisy pixels inside the patches are temporarily replaced by the patch mean to avoid bias. This non‑local averaging dramatically reduces the amount of impulse noise while preserving structural information, providing a much cleaner input for the subsequent learning stage.
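The per-pixel filtering step described above can be sketched as follows. The placement of the R patch centres (spread evenly on a circle of radius `search`) and the default parameter values are illustrative assumptions; the summary does not specify the paper's exact patch-sampling scheme.

```python
import numpy as np

def nlsf_pixel(img, i, j, L=5, R=8, search=8, sigma=20.0):
    """Sketch of the non-local switching filter for one noisy pixel (i, j).

    Collects R surrounding LxL patches from a non-local region, weights
    each patch's median by a Gaussian kernel on its Euclidean distance to
    the central patch, and returns the weighted sum. Noisy pixels inside a
    patch are swapped for the patch mean before distances are computed.
    """
    half = L // 2
    pad = np.pad(img.astype(np.float64), half + search, mode="reflect")

    def clean_patch(y, x):
        p = pad[y - half:y + half + 1, x - half:x + half + 1].copy()
        mask = (p == 0) | (p == 255)          # detected impulse pixels
        if mask.any() and (~mask).any():
            p[mask] = p[~mask].mean()         # avoid bias in the distance
        return p

    cy, cx = i + half + search, j + half + search
    centre = clean_patch(cy, cx)

    # R patch centres spread evenly on a circle of radius `search`
    angles = np.linspace(0.0, 2 * np.pi, R, endpoint=False)
    medians, weights = [], []
    for a in angles:
        y = cy + int(round(search * np.sin(a)))
        x = cx + int(round(search * np.cos(a)))
        p = clean_patch(y, x)
        d = np.linalg.norm(p - centre)        # Euclidean patch distance
        medians.append(np.median(p))
        weights.append(np.exp(-d**2 / (2 * sigma**2)))
    weights = np.array(weights)
    return float(np.dot(weights / weights.sum(), medians))
```

Running this over every detected impulse pixel yields the pre-cleaned image that feeds the CNN stage.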

The second stage trains a patch‑based CNN on the pre‑processed images. The training set consists of only 91 natural images (augmented with synthetic salt‑and‑pepper noise). Each image is divided into overlapping 64×64 patches. The network has three convolutional layers:

  1. Layer 1: 64 filters of size 9×9, followed by a ReLU activation, producing a high‑dimensional feature map for each patch.
  2. Layer 2: 1×1 convolutions (64 → 32 channels) with ReLU, which mixes channel information while keeping the spatial resolution unchanged.
  3. Layer 3: Filters of size 5×5 that reconstruct the denoised patch.
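Assuming 'valid' convolutions with no padding (the summary does not state the padding scheme), only layers 1 and 3 shrink the spatial extent of a 64×64 input patch; the 1×1 layer leaves it unchanged. A quick check of the spatial sizes:

```python
def conv_out(size, kernel, stride=1, pad=0):
    """Spatial output size of a convolution: floor((n + 2p - k) / s) + 1."""
    return (size + 2 * pad - kernel) // stride + 1

# 64x64 input patch through the three layers, assuming no padding
s1 = conv_out(64, 9)   # layer 1: 9x9 conv, 64 filters -> 56x56x64
s2 = conv_out(s1, 1)   # layer 2: 1x1 conv, 64 -> 32   -> 56x56x32
s3 = conv_out(s2, 5)   # layer 3: 5x5 reconstruction   -> 52x52x1
print(s1, s2, s3)      # prints: 56 56 52
```

This three-layer shape (9×9 feature extraction, 1×1 channel mixing, 5×5 reconstruction) mirrors the SRCNN-style design common in early patch-based restoration networks.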

The loss function is the mean‑squared error (MSE) between the network output and the ground‑truth clean patches, which directly translates to maximizing PSNR. Because the NLSF stage already removes most of the impulsive outliers, the CNN can focus on learning the subtle non‑linear mapping between the mildly corrupted patches and the clean ones, rather than being overwhelmed by extreme noise.
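The "directly translates" claim follows from the fixed-peak relation PSNR = 10·log10(MAX²/MSE): for 8-bit images MAX = 255 is constant, so minimizing MSE is equivalent to maximizing PSNR. A minimal helper pair (function names are illustrative):

```python
import numpy as np

def mse(x, y):
    """Mean-squared error between two images of equal shape."""
    return float(np.mean((x.astype(np.float64) - y.astype(np.float64)) ** 2))

def psnr(x, y, peak=255.0):
    """PSNR in dB for 8-bit images: 10 * log10(peak^2 / MSE)."""
    return 10.0 * np.log10(peak**2 / mse(x, y))
```

For example, two images differing everywhere by 5 gray levels give MSE = 25 and PSNR = 20·log10(255/5) ≈ 34.15 dB.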

The authors evaluate the method on two test collections: (i) a standard set of 11 benchmark images (e.g., Lena, Baboon, Pepper) and (ii) the BSD300 dataset (300 natural images). For each set, three noise densities are considered: 30%, 50%, and 70%. Four baseline methods are used for comparison: Decision‑Based Algorithm (DBA), Noise Adaptive Switching Non‑Local Means (NASNLM), Patch‑Based Approach to Remove Impulse‑Gaussian Noise (PARIGI), and a Multi‑Layer Perceptron (MLP) trained on a large external dataset. The MLP is also tested with the NLSF pre‑processing (NLSF‑MLP) to isolate the contribution of the filter.

Results show that NLSF‑CNN consistently outperforms all baselines in PSNR. On the 11‑image set, the average PSNR gains over the best traditional method (NASNLM) range from about 3 dB at 30 % noise to more than 6 dB at 70 % noise. Compared with NLSF‑MLP, NLSF‑CNN achieves higher PSNR despite using far fewer training images (91 vs. hundreds of thousands), demonstrating the data‑efficiency of the proposed architecture. Visual comparisons (Figures 3 and 4) confirm that NLSF‑CNN not only removes the salt‑and‑pepper spikes but also preserves fine details such as edges and texture, whereas DBA and NASNLM leave residual artifacts and the MLP tends to oversmooth.

A parameter study on the NLSF patch size (L = 3, 5, 7) reveals that small patches work best for low noise densities, while larger patches become advantageous when the noise level exceeds 30%, because larger neighborhoods provide enough reliable information under heavy corruption.

The paper’s contributions are threefold: (1) a novel non‑local switching filter that effectively pre‑cleans high‑density impulse noise, (2) a compact CNN architecture that learns the residual mapping from the pre‑processed patches to clean patches, and (3) extensive experiments showing superior performance with a modest training set.

Limitations include the need to manually select the NLSF patch size and the Gaussian kernel bandwidth, the current focus on grayscale images (color extension is not evaluated), and the computational cost of the non‑local weighted averaging, which may hinder real‑time deployment. Future work could explore adaptive parameter selection, joint processing of color channels, and algorithmic acceleration (e.g., using approximate nearest‑neighbor search) to make the method more practical for real‑world applications.

In summary, NLSF‑CNN presents an effective and data‑efficient solution for extreme salt‑and‑pepper noise removal, combining the strengths of non‑local statistical filtering and deep learning to achieve state‑of‑the‑art denoising performance.

