Restoration of Multichannel Images by Nonlinear Parabolic Partial Differential Equations (PDEs) and Optimization by Convolutional Neural Networks (CNNs)
1. Introduction
Image denoising is an essential step in digital image processing [1]. Images acquired by optical sensors are often degraded by various types of noise, including Gaussian, impulse, and Poisson noise [2].
Classical regularization methods have long been used to address this issue. The anisotropic diffusion model of Pietro Perona and Jitendra Malik was a key step in the development of edge-preserving denoising because it introduced a gradient-dependent diffusion coefficient. The Rudin-Osher-Fatemi (ROF) model of Leonid Rudin, Stanley Osher, and Emad Fatemi and the Φ-Laplacian equations [2] rely on minimizing an energy that combines data fidelity and spatial regularity. However, these approaches require fine-tuning of their parameters and do not always generalize well to diverse image contexts [3] [4].
Convolutional neural networks, by contrast, learn complex representations from data and have proven highly effective in image restoration. Zhang et al. [3] demonstrated that CNNs based on residual learning achieve state-of-the-art denoising performance, surpassing classical variational methods. The U-Net architecture proposed by Ronneberger et al. [5] enables accurate image reconstruction thanks to its skip connections, which help preserve fine details and spatial structures. Combining the two paradigms (continuous mathematical modeling and deep learning) therefore offers a promising hybrid approach that leverages both the theoretical robustness of variational models and the learning power of neural networks [6] [7].
The main objective of this work is to design a color image denoising system that combines the advantages of classical variational methods and of convolutional neural networks (CNNs). More specifically, the aims are to improve the quality of images corrupted by Gaussian noise while preserving contours and fine details; to compare the performance of a nonlinear diffusion model (Φ-Laplacian) with that of a neural network trained for the same task; and to show the convergence of the optimization process through the evolution of the PSNR (Peak Signal-to-Noise Ratio) [6].
2. Theoretical Framework
2.1. Variational Φ-Laplacian Model
The image restoration model considered is based on a variational approach consisting of minimizing an energy functional defined on the spatial domain $\Omega$ [1] [2]:
$$E(u) = \frac{1}{2} \int_{\Omega} (u - f)^2 \, dx + \alpha \int_{\Omega} \Phi(|\nabla u|) \, dx$$
Interpretation of terms
$f$: observed (noisy) image [1].
$u$: desired restored image.
$\nabla u$: spatial gradient of the image (detects local variations) [4].
$\Phi(|\nabla u|)$: regularization term.
$\alpha$: regularization parameter controlling the trade-off between fidelity and smoothing.
Choice of the regularization function
We consider the function:
$$\Phi(s) = \frac{s^p}{p}, \qquad 1 \le p \le 2$$
If $p = 2$ → linear diffusion (classical Gaussian filtering) [1].
If $p = 1$ → Total variation (TV) type regularization [2].
If $1 < p < 2$ → non-linear diffusion, allowing a compromise between smoothing and preserving edges [4].
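To see how the exponent p shapes the diffusion, the sketch below evaluates the diffusivity g(s) = s^(p−2) associated with a power-type regularizer Φ(s) = s^p / p. That form of Φ, the function name `diffusivity`, and the small constant `eps` (added to avoid division by zero at s = 0) are assumptions made for illustration.

```python
import numpy as np

def diffusivity(grad_norm, p, eps=1e-3):
    """Regularized diffusivity g(s) = (s^2 + eps^2)^((p - 2) / 2), close to s^(p - 2)."""
    return (grad_norm**2 + eps**2) ** ((p - 2.0) / 2.0)

s = np.array([0.01, 0.1, 1.0])      # small, medium, and large gradient magnitudes

g_linear = diffusivity(s, p=2.0)    # p = 2: constant diffusivity (linear diffusion)
g_tv = diffusivity(s, p=1.0)        # p = 1: roughly 1 / s (TV-type behaviour)
g_mid = diffusivity(s, p=1.2)       # 1 < p < 2: decreasing in s, but more slowly than 1 / s

print(g_linear, g_tv, g_mid)
```

For p < 2 the diffusivity decreases as the gradient grows, which is what slows smoothing down across edges.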
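Euler-Lagrange equation
Minimizing the energy leads to the following Euler-Lagrange equation [2] [3]:
$$u - f - \alpha \, \mathrm{div}\big( g(|\nabla u|) \, \nabla u \big) = 0$$
where:
$$g(|\nabla u|) = \frac{\Phi'(|\nabla u|)}{|\nabla u|} = |\nabla u|^{p-2}$$
is the diffusion coefficient.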
2.1.1. Physical Interpretation
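This equation can be interpreted as a nonlinear anisotropic diffusion process. For a detailed study of the discretization and stability of nonlinear diffusion equations, see Weickert [8].
For $p < 2$, the diffusion coefficient $|\nabla u|^{p-2}$ performs adaptive smoothing: it is large in homogeneous regions, where $|\nabla u|$ is small, and small near edges, where $|\nabla u|$ is large, so noise is smoothed while contours are preserved.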
2.1.2. Numerical Discretization
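The equation is solved numerically using an explicit scheme [2]:
$$u^{n+1} = u^n + \tau \left[ \alpha \, \mathrm{div}\big( |\nabla u^n|^{p-2} \nabla u^n \big) - (u^n - f) \right]$$
where:
$\tau$ is the time step and $n$ is the iteration index.
This scheme iterates progressively toward a stable solution and, provided the time step $\tau$ is small enough for the explicit scheme to remain stable, converges to an energy minimum [2] [7].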
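A minimal NumPy sketch of such an explicit iteration for a single-channel image is given below. It assumes the update u ← u + τ[α div(g ∇u) − (u − f)] with g = (|∇u|² + ε²)^((p−2)/2). The placement of α on the diffusion term, the regularization constant `eps`, the forward/backward difference operators, and all function names are assumptions for illustration, not the authors' implementation.

```python
import numpy as np

def grad(u):
    """Forward differences; the last column/row of the gradient is set to zero."""
    gx = np.zeros_like(u)
    gy = np.zeros_like(u)
    gx[:, :-1] = u[:, 1:] - u[:, :-1]
    gy[:-1, :] = u[1:, :] - u[:-1, :]
    return gx, gy

def div(px, py):
    """Backward-difference divergence, the negative adjoint of grad."""
    dx = np.zeros_like(px)
    dy = np.zeros_like(py)
    dx[:, 0] = px[:, 0]
    dx[:, 1:-1] = px[:, 1:-1] - px[:, :-2]
    dx[:, -1] = -px[:, -2]
    dy[0, :] = py[0, :]
    dy[1:-1, :] = py[1:-1, :] - py[:-2, :]
    dy[-1, :] = -py[-2, :]
    return dx + dy

def phi_laplacian_denoise(f, alpha=0.15, p=1.2, tau=0.2, n_iter=300, eps=0.1):
    """Explicit iterations starting from the noisy image f (assumed initial condition)."""
    u = f.copy()
    for _ in range(n_iter):
        gx, gy = grad(u)
        g = (gx**2 + gy**2 + eps**2) ** ((p - 2.0) / 2.0)   # regularized |grad u|^(p-2)
        u = u + tau * (alpha * div(g * gx, g * gy) - (u - f))
    return u

# Synthetic test: a vertical step edge corrupted by Gaussian noise.
rng = np.random.default_rng(0)
clean = np.full((32, 32), 0.25)
clean[:, 16:] = 0.75
noisy = clean + rng.normal(0.0, 0.06, clean.shape)
restored = phi_laplacian_denoise(noisy)

mse_noisy = np.mean((noisy - clean) ** 2)
mse_restored = np.mean((restored - clean) ** 2)
print(mse_noisy, mse_restored)
```

With these assumed operators, eps = 0.1 keeps the step τ = 0.2 inside the usual gradient-descent stability bound, since τ(1 + 8 α eps^(p−2)) ≈ 1.7 < 2. A much smaller eps would call for a smaller τ.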
2.1.3. Extension to Multichannel Images
In this work, multichannel images are modeled as vector-valued functions. In the PDE approach, channel coupling is explicitly enforced through the joint gradient norm, ensuring a shared diffusion process across channels. In contrast, the CNN implicitly captures inter-channel dependencies through multi-channel convolutional filters, enabling learned correlations between RGB components. This distinction highlights the difference between model-driven and data-driven coupling mechanisms.
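In the case of color images $u = (u_R, u_G, u_B)$, the gradient becomes vector-valued:
$$\nabla u = (\nabla u_R, \nabla u_G, \nabla u_B), \qquad |\nabla u| = \sqrt{|\nabla u_R|^2 + |\nabla u_G|^2 + |\nabla u_B|^2}$$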
This formulation allows for correlated diffusion across channels, preserving color consistency. The numerical solution of the regularization functional can be efficiently implemented using a primal-dual scheme [9].
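The joint gradient norm that couples the channels can be sketched as follows. The array layout (height, width, channels), the forward-difference discretization, and the function name `joint_gradient_norm` are assumptions for illustration.

```python
import numpy as np

def joint_gradient_norm(u):
    """Return an (H, W) map of the gradient norm taken jointly over all channels.

    u has shape (H, W, C); forward differences are used along both image axes.
    """
    gx = np.zeros_like(u)
    gy = np.zeros_like(u)
    gx[:, :-1, :] = u[:, 1:, :] - u[:, :-1, :]
    gy[:-1, :, :] = u[1:, :, :] - u[:-1, :, :]
    return np.sqrt(np.sum(gx**2 + gy**2, axis=2))

# A unit step that appears in the red channel only.
red_edge = np.zeros((4, 4, 3))
red_edge[:, 2:, 0] = 1.0
norm_red = joint_gradient_norm(red_edge)

# The same step present in all three channels.
gray_edge = np.zeros((4, 4, 3))
gray_edge[:, 2:, :] = 1.0
norm_gray = joint_gradient_norm(gray_edge)

print(norm_red[0], norm_gray[0])
```

Because a single scalar map is shared by all channels, an edge detected in one channel reduces the diffusivity in the others at the same location, which is what keeps colors aligned across edges.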
2.2. Convolutional Neural Network (CNN)
In addition to the variational model, a convolutional neural network is used to learn the denoising process in a data-driven manner. The CNN acts as a universal approximator, capable of learning complex transformations between noisy and clean images [10].
2.2.1. CNN Architecture
Type: Fully convolutional network (3 layers).
Kernel size: 3 × 3 (padding = 1, size conservation).
Number of filters: Layer 1: 64 filters; Layer 2: 64 filters; Layer 3: 3 filters (RGB reconstruction).
Activation functions: ReLU after each layer except the last.
Input/output: 256 × 256 RGB images.
Training parameters: Adam optimizer, learning rate $10^{-3}$, 200 epochs, MSE loss.
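The adopted network is a lightweight convolutional autoencoder [3], composed of three main layers:
$$h_1 = \sigma(W_1 * f + b_1), \qquad h_2 = \sigma(W_2 * h_1 + b_2), \qquad \hat{u} = W_3 * h_2 + b_3$$
where ($*$) denotes the convolution operation, $\sigma$ is the ReLU activation function [10], $W_k$ and $b_k$ are the weights and biases of layer $k$, $f$ is the noisy input image, and $\hat{u}$ is the network output.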
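The forward pass of a three-layer network of this kind can be sketched in plain NumPy. The He-style random initialization, the channel-first layout (C, H, W), the 32 × 32 test size, and the helper names `conv3x3`, `init_params`, and `forward` are assumptions for illustration; training is not shown.

```python
import numpy as np

def conv3x3(x, w, b):
    """Same-size 3x3 cross-correlation with zero padding (the usual CNN 'convolution').

    x: (C_in, H, W), w: (C_out, C_in, 3, 3), b: (C_out,).
    """
    _, h, wd = x.shape
    xp = np.pad(x, ((0, 0), (1, 1), (1, 1)))
    out = np.zeros((w.shape[0], h, wd))
    for i in range(3):
        for j in range(3):
            # Contribution of kernel tap (i, j), summed over input channels.
            out += np.einsum("oc,chw->ohw", w[:, :, i, j], xp[:, i:i + h, j:j + wd])
    return out + b[:, None, None]

def init_params(rng, channels=(3, 64, 64, 3)):
    """Random weights and zero biases for the 3 -> 64 -> 64 -> 3 layer stack."""
    params = []
    for c_in, c_out in zip(channels[:-1], channels[1:]):
        w = rng.normal(0.0, np.sqrt(2.0 / (9 * c_in)), size=(c_out, c_in, 3, 3))
        params.append((w, np.zeros(c_out)))
    return params

def forward(x, params):
    h = x
    for k, (w, b) in enumerate(params):
        h = conv3x3(h, w, b)
        if k < len(params) - 1:
            h = np.maximum(h, 0.0)   # ReLU after every layer except the last
    return h

rng = np.random.default_rng(0)
params = init_params(rng)
x = rng.random((3, 32, 32))          # small stand-in for a 256 x 256 RGB image
y = forward(x, params)
n_params = sum(w.size + b.size for w, b in params)
print(y.shape, n_params)
```

With 64, 64, and 3 filters of size 3 × 3 and one bias per filter, the stack has 1,792 + 36,928 + 1,731 = 40,451 trainable parameters.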
2.2.2. Operating Principle
The CNN learns a mapping function:
$$F_\theta : f \mapsto \hat{u}$$
which directly approximates the transformation from the noisy input image $f$ to the clean output image $\hat{u}$.
2.2.3. Loss Function
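The training is based on minimizing the mean squared error (MSE) [6]:
$$\mathcal{L}(\theta) = \frac{1}{N} \left\| F_\theta(f) - u^{\ast} \right\|_2^2$$
where $F_\theta(f)$ is the network output for the noisy input $f$, $u^{\ast}$ is the clean reference image, and $N$ is the number of pixel values.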
2.2.4. Interpretation
The CNN acts as a universal approximator of denoising operators [7]:
It implicitly learns filters adapted to the image structures.
Unlike PDEs, it does not require explicit tuning of physical parameters.
Learning is performed by backpropagation, with the parameters θ optimized by algorithms such as Adam [6].
2.3. Link between PDEs and CNNs (Unified View)
CNNs (see Figure 1) can be interpreted as a learned discretization of a diffusion process [7]:
Each convolutional layer corresponds to a diffusion iteration.
The learned filters correspond to the adaptive diffusion coefficients.
The entire network is equivalent to an optimized iterative scheme.
This analogy paves the way for modern Deep Unfolding methods, where PDE iterations are transformed into trainable neural network layers [7].
Figure 1. The convolutional neural network.
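The analogy between a convolutional layer and a diffusion iteration can be checked directly in the linear case (p = 2), where one explicit step u + τΔu with the 5-point Laplacian is exactly a 3 × 3 convolution with a fixed kernel. The sketch below verifies this numerically. The edge-replicated borders and the helper names are assumptions for illustration (the network described above uses zero padding instead).

```python
import numpy as np

def laplacian_step(u, tau):
    """One explicit linear diffusion step using the 5-point Laplacian."""
    up = np.pad(u, 1, mode="edge")
    lap = up[:-2, 1:-1] + up[2:, 1:-1] + up[1:-1, :-2] + up[1:-1, 2:] - 4.0 * u
    return u + tau * lap

def conv3x3_single(u, kernel):
    """Same-size 3x3 correlation of a single-channel image with replicated borders."""
    up = np.pad(u, 1, mode="edge")
    h, w = u.shape
    out = np.zeros_like(u)
    for i in range(3):
        for j in range(3):
            out += kernel[i, j] * up[i:i + h, j:j + w]
    return out

tau = 0.2
# Kernel that reproduces the identity plus tau times the discrete Laplacian.
kernel = np.array([[0.0, tau, 0.0],
                   [tau, 1.0 - 4.0 * tau, tau],
                   [0.0, tau, 0.0]])

u = np.random.default_rng(0).random((16, 16))
same = np.allclose(laplacian_step(u, tau), conv3x3_single(u, kernel))
print(same)
```

In a learned network the kernel entries are free parameters and a nonlinearity follows each layer, so the correspondence with nonlinear diffusion is an interpretation rather than an identity.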
3. Methodology
The experimental procedure follows these steps:
Data loading and preparation: A color image is loaded and resized to 256 × 256. Gaussian noise with standard deviation σ = 0.06 (intensities scaled to [0, 1]) is added.
Variational denoising (Φ-Laplacian): The nonlinear diffusion method is applied for a fixed number of iterations (300), with the parameters α = 0.15, p = 1.2, and τ = 0.2. These values were chosen empirically as a compromise between edge preservation, numerical stability, and convergence.
CNN training: The noisy image is provided as input, and the clean image serves as ground truth. The network is trained with Adam (learning rate $10^{-3}$) for 200 epochs, which allows rapid convergence. Because the CNN is trained and evaluated on the same noisy-clean image pair, it operates in a severe overfitting regime. The reported PSNR values should therefore be interpreted as an upper bound on reconstruction performance rather than as a measure of generalization ability.
Performance comparison: The results of the two approaches are evaluated using the PSNR metric and visualized by PSNR and loss curves.
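The noise model and the PSNR metric used in these steps can be sketched as follows. The sketch assumes intensities in [0, 1], no clipping after noise is added, and a constant placeholder image in place of the actual photograph; the function name `psnr` is also an assumption.

```python
import numpy as np

def psnr(reference, estimate, peak=1.0):
    """Peak signal-to-noise ratio in dB for images with the given peak value."""
    mse = np.mean((reference - estimate) ** 2)
    return 10.0 * np.log10(peak**2 / mse)

rng = np.random.default_rng(0)
sigma = 0.06
clean = np.full((256, 256, 3), 0.5)                  # placeholder for the resized color image
noisy = clean + rng.normal(0.0, sigma, clean.shape)  # additive Gaussian noise, not clipped

baseline = psnr(clean, noisy)                        # measured on this noise realization
expected = 10.0 * np.log10(1.0 / sigma**2)           # closed form for unclipped noise
print(baseline, expected)
```

For σ = 0.06 on a unit intensity range, the closed-form value is 10 log10(1/σ²) ≈ 24.4 dB, which is consistent with the ≈24.5 dB reported for the noisy image in Section 4.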
4. Results
Experiments show that the Φ-Laplacian model achieves a final PSNR of approximately 30.5 dB, an improvement of about 6 dB over the noisy image (≈24.5 dB). The PSNR rises rapidly during the first iterations and then stabilizes towards an optimum.
The convolutional neural network, on the other hand, efficiently learns the image structure and achieves a higher PSNR (≈32 dB) after 200 training epochs. Visually, the CNN produces smoother and better reconstructed images, while preserving the main contours, whereas the Φ-Laplacian sometimes tends to slightly smooth textures.
The following pictures (Figure 2) illustrate:
The visual comparison between the original, noisy, and denoised images.
The convergence of the PSNR during optimization.
The decrease in CNN loss during training.
Figure 2. Results of the experiment.
5. Conclusions and Future Work
This work presents a comparative study between a variational PDE-based method (Φ-Laplacian) and a convolutional neural network (CNN) for multichannel image denoising. The two approaches are implemented and evaluated independently on the same noisy input.
The results obtained highlight the complementarity between these two paradigms. On the one hand, the Φ-Laplacian-based model offers a rigorous mathematical framework, guaranteeing essential properties such as the stability, convergence, and interpretability of the diffusion process. It enables efficient adaptive smoothing while preserving the contours and fine structures of the images. On the other hand, the CNN demonstrates a remarkable ability to learn complex representations from the data, allowing for superior performance in terms of reconstruction quality, notably as measured by PSNR. However, each of these approaches has limitations: variational methods require precise parameter tuning and can smooth out certain textures, while deep learning models sometimes lack interpretability and theoretical guarantees.
In this context, a natural and promising perspective is to integrate these two approaches within a single, unified framework. This can be achieved by using the Φ-Laplacian as a preprocessing step or, more ambitiously, as a regularization term integrated into the CNN’s loss function, in order to guide learning towards physically consistent and more stable solutions. Furthermore, future extensions could include:
The development of Deep Unfolding models, where the PDE iterations are reformulated as trainable neural network layers.
The adaptation of the model to multispectral and hyperspectral data, where inter-channel correlations are more complex.
The integration of advanced perceptual metrics (such as SSIM) for a more accurate assessment of visual quality.
The optimization of architectures for real-time applications and embedded systems.
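One of the hybrid options mentioned above, using the Φ-Laplacian energy as a regularization term in the training loss, could look like the following sketch. The weight `beta`, the smoothing constant `eps`, the power-type form Φ(s) = s^p / p, and the function names are hypothetical choices for illustration; this combined loss was not evaluated in the present study.

```python
import numpy as np

def phi_energy(u, p=1.2, eps=1e-3):
    """Mean of Phi(|grad u|) = |grad u|^p / p for an (H, W, C) image.

    The gradient norm is taken jointly over all channels, and eps smooths
    the norm near zero.
    """
    gx = np.zeros_like(u)
    gy = np.zeros_like(u)
    gx[:, :-1, :] = u[:, 1:, :] - u[:, :-1, :]
    gy[:-1, :, :] = u[1:, :, :] - u[:-1, :, :]
    s2 = np.sum(gx**2 + gy**2, axis=2)
    return float(np.mean((s2 + eps**2) ** (p / 2.0) / p))

def hybrid_loss(prediction, target, beta=0.01, p=1.2):
    """MSE data term plus a beta-weighted Phi-Laplacian penalty on the prediction."""
    data_term = float(np.mean((prediction - target) ** 2))
    return data_term + beta * phi_energy(prediction, p=p)

rng = np.random.default_rng(0)
target = np.full((16, 16, 3), 0.5)                       # flat placeholder image
noisy_pred = target + rng.normal(0.0, 0.06, target.shape)

loss_clean = hybrid_loss(target, target)                 # perfect, smooth prediction
loss_noisy = hybrid_loss(noisy_pred, target)             # noisy prediction is penalized twice
print(loss_clean, loss_noisy)
```

In an actual training loop the same expression would be written in an automatic-differentiation framework so that the penalty contributes to the gradients with respect to θ.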
In conclusion, this work confirms that the hybridization of variational analysis and deep learning constitutes a particularly promising research avenue for image restoration. It makes it possible to reconcile theoretical rigor, numerical performance, and interpretability, thus paving the way for more robust systems better suited to real-world applications.