ConvRML: High-Quality Lensless Imaging with Random Multi-Focal Lenslets
Mask-based lensless imagers use simple optics and computational reconstruction to achieve compact-form-factor cameras with compressive imaging ability. However, these imagers generally suffer from poor reconstruction quality. Here, we describe several advances in both hardware and software that result in improved lensless imaging quality. First, we use a precision-manufactured random multi-focal lenslet (RML) phase mask to produce improved measurements with reduced multiplexing. Next, we implement a ConvNeXt-based reconstruction architecture, which provides up to 6.68 dB improvement in peak signal-to-noise ratio over state-of-the-art attention-based architectures. Finally, we establish a parallel imaging setup that simultaneously images a scene with RML, diffuser, and lens systems, with which we collect a dataset of 100,000 measurements per system for reconstruction-model training and evaluation. Using this standardized system, we quantify the improved measurement quality of the RML compared to a diffuser using the modulation transfer function and mutual information. Our ConvRML system benefits from both the optical and the computational developments presented in this work, and our contributions establish resources to support continued development of high-quality, compact, and compressive lensless imagers.
💡 Research Summary
The paper introduces ConvRML, a high‑quality lensless imaging system that combines a precision‑fabricated random multi‑focal lenslet (RML) phase mask with a ConvNeXt‑based deep reconstruction network, and provides a large, standardized dataset for training and evaluation.
Optical encoder: The RML mask consists of overlapping lenslets of varying focal lengths placed at pseudo‑random positions across the aperture. Compared with conventional diffuser masks, which generate highly multiplexed caustic patterns, the RML produces a low‑multiplexing point spread function (PSF) composed of a few sharp focal spots with minimal background. This design reduces measurement contrast loss, improves dynamic range, and is less sensitive to sensor‑to‑mask distance variations. The mask is fabricated using modern two‑photon lithography, enabling high‑resolution, large‑area, repeatable production.
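The multiplexing contrast described above can be sketched with a toy forward model: to first approximation, a lensless measurement is the scene convolved with the mask's point spread function, so a sparse, few-spot PSF (RML-like) preserves far more measurement contrast than a dense, caustic-like one (diffuser-like). The PSFs below are synthetic stand-ins for illustration, not the paper's measured patterns:

```python
import numpy as np

rng = np.random.default_rng(0)

def measure(scene, psf):
    """Toy lensless forward model: circular 2D convolution of scene with PSF."""
    return np.real(np.fft.ifft2(np.fft.fft2(scene) * np.fft.fft2(psf)))

N = 64
scene = rng.random((N, N))

# RML-like PSF: a handful of sharp focal spots (low multiplexing).
rml_psf = np.zeros((N, N))
spots = rng.integers(0, N, size=(8, 2))
rml_psf[spots[:, 0], spots[:, 1]] = 1.0
rml_psf /= rml_psf.sum()

# Diffuser-like PSF: dense pseudo-random pattern (heavy multiplexing).
diffuser_psf = rng.random((N, N))
diffuser_psf /= diffuser_psf.sum()

rml_meas = measure(scene, rml_psf)
diff_meas = measure(scene, diffuser_psf)

# The sparse PSF sums only a few shifted scene copies, so its measurement
# retains far more contrast than the dense PSF's near-uniform average.
print(rml_meas.std(), diff_meas.std())
```

With a dense PSF, every sensor pixel averages contributions from essentially the whole scene, collapsing the measurement toward its mean; this is the contrast and dynamic-range loss the RML design avoids.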
Reconstruction architecture: ConvNeXt, a modern convolutional neural network, is employed as the decoder. Its hierarchical down‑sampling blocks and multi‑scale feature concatenation yield large effective receptive fields while keeping computational cost moderate, so the network can capture the long‑range dependencies introduced by the multiplexed PSFs. Separate ConvNeXt models are trained for the RML and for a baseline diffuser system, each using 50,000 measurements from the authors' dataset. Under identical training conditions, the RML‑ConvNeXt pair achieves up to 6.68 dB higher PSNR than state‑of‑the‑art attention‑based models, with an average improvement of 2.99 dB PSNR and 0.116 SSIM over the diffuser‑ConvNeXt baseline.
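The reconstruction gains above are reported in PSNR (and SSIM). For reference, a minimal PSNR implementation, with synthetic noisy images standing in for actual reconstructions:

```python
import numpy as np

def psnr(ref, est, peak=1.0):
    """Peak signal-to-noise ratio in dB between reference and estimate."""
    mse = np.mean((ref - est) ** 2)
    return 10.0 * np.log10(peak ** 2 / mse)

rng = np.random.default_rng(1)
ref = rng.random((32, 32))

# Two stand-in "reconstructions" with different noise levels:
# lower residual error translates directly into a dB gain.
coarse = np.clip(ref + 0.05 * rng.standard_normal(ref.shape), 0, 1)
fine = np.clip(ref + 0.02 * rng.standard_normal(ref.shape), 0, 1)

print(psnr(ref, coarse), psnr(ref, fine))
```

Because PSNR is logarithmic in mean squared error, the paper's 6.68 dB improvement corresponds to a roughly 4.7x reduction in MSE relative to the attention-based baselines.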
Standardized parallel dataset: To isolate optical effects from experimental variations, the authors built a parallel imaging rig that simultaneously captures scenes with three systems: a conventional lens (ground truth), a diffuser mask, and the RML mask. All three share the same sensor, illumination, magnification, and alignment. Using automated hardware‑software control, they collected 100,000 measurements per system, yielding the "Parallel Lensless Dataset" (PLD), the largest open‑source lensless imaging dataset to date (300,000 measurements in total).
Quantitative optical analysis: Modulation Transfer Function (MTF) measurements show that the RML transfers higher spatial frequencies than the diffuser, reflecting its lower multiplexing. Mutual information (MI) analyses evaluate robustness to read noise and quantization. Across a range of read‑noise standard deviations, the RML consistently encodes more information than the diffuser. When quantizing to low bit‑depths (4–12 bits), the RML retains non‑zero MI even at 4 bits, whereas the diffuser’s MI drops to near zero, confirming the RML’s superior dynamic range.
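The logic of the quantization analysis can be sketched with a histogram-based mutual information estimate between a "scene" signal and its quantized "measurement". The diffuser-like signal below is simply a contrast-compressed copy of the scene; this is an illustrative assumption, not the paper's actual simulation:

```python
import numpy as np

def mutual_information(x, y, bins=16):
    """Histogram-based MI estimate (in bits) between two 1D signals."""
    joint, _, _ = np.histogram2d(x, y, bins=bins)
    pxy = joint / joint.sum()
    px = pxy.sum(axis=1, keepdims=True)
    py = pxy.sum(axis=0, keepdims=True)
    nz = pxy > 0
    return float(np.sum(pxy[nz] * np.log2(pxy[nz] / (px * py)[nz])))

def quantize(v, bits):
    """Uniformly quantize a [0, 1] signal to the given bit depth."""
    levels = 2 ** bits - 1
    return np.round(np.clip(v, 0, 1) * levels) / levels

rng = np.random.default_rng(0)
scene = rng.random(20000)

# High-contrast measurement (RML-like, low multiplexing) vs. a
# contrast-compressed one (diffuser-like, heavy multiplexing).
rml_like = scene
diffuser_like = 0.5 + 0.01 * (scene - 0.5)

for bits in (4, 8, 12):
    print(bits,
          mutual_information(scene, quantize(rml_like, bits)),
          mutual_information(scene, quantize(diffuser_like, bits)))
```

At coarse bit depths, the compressed-range signal spans only a few quantization levels, so most of its information is destroyed, mirroring the paper's finding that the diffuser's MI collapses at 4 bits while the RML's does not.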
System‑level performance: Real‑world reconstructions demonstrate that ConvRML recovers finer texture, higher contrast, and more accurate color fidelity than the diffuser‑based system. The authors also discuss how the RML’s design reduces sensitivity to alignment errors compared with regular lenslet arrays that require precise focal‑plane positioning.
Contributions and impact: The work advances lensless imaging on three fronts: (1) a manufacturable, low‑multiplexing optical encoder that improves raw measurement quality; (2) a deep learning decoder (ConvNeXt) that efficiently exploits the improved measurements; (3) a large, open‑source dataset enabling fair benchmarking of future algorithms. By showing that better optics and better algorithms together yield substantial gains, the paper sets a new baseline for compact, compressive cameras and provides resources that will accelerate research and potential commercialization in areas such as mobile photography, machine vision, and biomedical microscopy.